NUCLEIC ACID SEQUENCES TO PROTEINS INVOLVED IN TOCOPHEROL SYNTHESIS



INTRODUCTION 



TECHNICAL FIELD 

The present invention is directed to nucleic acid and amino acid sequences and
constructs, and methods related thereto.
BACKGROUND 

Isoprenoids are ubiquitous compounds found in all living organisms. Plants
synthesize a diverse array of greater than 22,000 isoprenoids (Connolly and Hill
(1992) Dictionary of Terpenoids, Chapman and Hall, New York, NY). In plants,
isoprenoids play essential roles in particular cell functions such as production of
sterols, contributing to eukaryotic membrane architecture, acyclic polyprenoids found
in the side chain of ubiquinone and plastoquinone, growth regulators like abscisic
acid, gibberellins, brassinosteroids or the photosynthetic pigments chlorophylls and
carotenoids. Although the physiological role of other plant isoprenoids is less evident,
like that of the vast array of secondary metabolites, some are known to play key roles
mediating the adaptive responses to different environmental challenges. In spite of
the remarkable diversity of structure and function, all isoprenoids originate from a
single metabolic precursor, isopentenyl diphosphate (IPP) (Wright, (1961) Annu. Rev.
Biochem. 20:525-548; and Spurgeon and Porter, (1981) in Biosynthesis of Isoprenoid
Compounds, Porter and Spurgeon eds (John Wiley, New York) Vol. 1, pp. 1-46).

A number of unique and interconnected biochemical pathways derived from 
the isoprenoid pathway leading to secondary metabolites, including tocopherols, exist 
in chloroplasts of higher plants. Tocopherols not only perform vital functions in
plants, but are also important from mammalian nutritional perspectives. In plastids,
tocopherols account for up to 40% of the total quinone pool.

Tocopherols and tocotrienols (unsaturated tocopherol derivatives) are well
known antioxidants, and play an important role in protecting cells from free radical
damage, and in the prevention of many diseases, including cardiac disease, cancer,
cataracts, retinopathy, Alzheimer's disease, and neurodegeneration, and have been
shown to have beneficial effects on symptoms of arthritis, and in anti-aging. Vitamin
E is used in chicken feed for improving the shelf life, appearance, flavor, and
oxidative stability of meat, and to transfer tocols from feed to eggs. Vitamin E has
been shown to be essential for normal reproduction, to improve overall performance,
and to enhance immunocompetence in livestock animals. Vitamin E supplement in
animal feed also imparts oxidative stability to milk products.

The demand for natural tocopherols as supplements has been steadily growing
at a rate of 10-20% for the past three years. At present, the demand exceeds the
supply for natural tocopherols, which are known to be more biopotent than racemic
mixtures of synthetically produced tocopherols. Naturally occurring tocopherols are
all d-stereomers, whereas synthetic α-tocopherol is a mixture of eight d,l-α-tocopherol
isomers, only one of which (12.5%) is identical to the natural d-α-tocopherol.
Natural d-α-tocopherol has the highest vitamin E activity (1.49 IU/mg)
when compared to other natural tocopherols or tocotrienols. The synthetic
α-tocopherol has a vitamin E activity of 1.1 IU/mg. In 1995, the worldwide market for
raw refined tocopherols was $1020 million; synthetic materials comprised 85-88% of
the market, the remaining 12-15% being natural materials. The best sources of natural
tocopherols and tocotrienols are vegetable oils and grain products. Currently, most of
the natural Vitamin E is produced from γ-tocopherol derived from soy oil processing,
which is subsequently converted to α-tocopherol by chemical modification
(α-tocopherol exhibits the greatest biological activity).

Methods of enhancing the levels of tocopherols and tocotrienols in plants, 
especially levels of the more desirable compounds that can be used directly, without
chemical modification, would be useful to the art as such molecules exhibit better
functionality and bioavailability.

In addition, methods for the increased production of other isoprenoid derived
compounds in a host plant cell are desirable. Furthermore, methods for the production of
particular isoprenoid compounds in a host plant cell are also needed.
SUMMARY OF THE INVENTION 

The present invention is directed to sequences to proteins involved in 
tocopherol synthesis. The polynucleotides and polypeptides of the present invention 
include those derived from prokaryotic and eukaryotic sources.

Thus, one aspect of the present invention relates to prenyltransferase, and in
particular to isolated polynucleotide sequences encoding prenyltransferase proteins
and polypeptides related thereto. In particular, isolated nucleic acid sequences
encoding prenyltransferase proteins from bacterial and plant sources are provided.

In another aspect, the present invention provides isolated polynucleotide
sequences encoding tocopherol cyclase, and polypeptides related thereto. In
particular, isolated nucleic acid sequences encoding tocopherol cyclase proteins from
bacterial and plant sources are provided.

Another aspect of the present invention relates to oligonucleotides which 
include partial or complete prenyltransferase or tocopherol cyclase encoding 
sequences.

It is also an aspect of the present invention to provide recombinant DNA 
constructs which can be used for transcription or transcription and translation 
(expression) of prenyltransferase or tocopherol cyclase. In particular, constructs are 
provided which are capable of transcription or transcription and translation in host 
cells.

In another aspect of the present invention, methods are provided for 
production of prenyltransferase or tocopherol cyclase in a host cell or progeny 
thereof. In particular, host cells are transformed or transfected with a DNA construct 
which can be used for transcription or transcription and translation of
prenyltransferase or tocopherol cyclase. The recombinant cells which contain
prenyltransferase or tocopherol cyclase are also part of the present invention. 

In a further aspect, the present invention relates to methods of using 
polynucleotide and polypeptide sequences to modify the tocopherol content of host 
cells, particularly in host plant cells. Plant cells having such a modified tocopherol
content are also contemplated herein. Methods and cells in which both
prenyltransferase and tocopherol cyclase are expressed in a host cell are also part of
the present invention. 

The modified plants, seeds and oils obtained by the expression of the 
prenyltransferase or tocopherol cyclase are also considered part of the invention.
BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 provides an amino acid sequence alignment between ATPT2, ATPT3,
ATPT4, ATPT8, and ATPT12, performed using ClustalW.

Figure 2 provides a schematic picture of the expression construct pCGN10800.

Figure 3 provides a schematic picture of the expression construct pCGN10801.

Figure 4 provides a schematic picture of the expression construct pCGN10803.

Figure 5 provides a schematic picture of the construct pCGN10806.

Figure 6 provides a schematic picture of the construct pCGN10807.

Figure 7 provides a schematic picture of the construct pCGN10808.

Figure 8 provides a schematic picture of the expression construct pCGN10809.

Figure 9 provides a schematic picture of the expression construct pCGN10810.

Figure 10 provides a schematic picture of the expression construct pCGN10811.

Figure 11 provides a schematic picture of the expression construct pCGN10812.

Figure 12 provides a schematic picture of the expression construct pCGN10813.

Figure 13 provides a schematic picture of the expression construct pCGN10814.

Figure 14 provides a schematic picture of the expression construct pCGN10815.

Figure 15 provides a schematic picture of the expression construct pCGN10816.

Figure 16 provides a schematic picture of the expression construct pCGN10817.

Figure 17 provides a schematic picture of the expression construct pCGN10819.

Figure 18 provides a schematic picture of the expression construct pCGN10824.

Figure 19 provides a schematic picture of the expression construct pCGN10825. 

Figure 20 provides a schematic picture of the expression construct pCGN10826. 

Figure 21 provides an amino acid sequence alignment using ClustalW between the
Synechocystis prenyltransferase sequences.

Figure 22 provides an amino acid sequence of the ATPT2, ATPT3, ATPT4,
ATPT8, and ATPT12 protein sequences from Arabidopsis and the slr1736, slr0926,
sll1899, slr0056, and the slr1518 amino acid sequences from Synechocystis.

Figure 23 provides the results of the enzymatic assay from preparations of
wild type Synechocystis strain 6803, and Synechocystis slr1736 knockout.

Figure 24 provides bar graphs of HPLC data obtained from seed extracts of
transgenic Arabidopsis containing pCGN10822, which provides for the expression of
the ATPT2 sequence, in the sense orientation, from the napin promoter. Provided are
graphs for alpha, gamma, and delta tocopherols, as well as total tocopherol for 22
transformed lines, as well as a nontransformed (wildtype) control.

Figure 25 provides a bar graph of HPLC analysis of seed extracts from
Arabidopsis plants transformed with pCGN10803 (35S-ATPT2, in the antisense
orientation), pCGN10822 (line 1625, napin ATPT2 in the sense orientation),
pCGN10809 (line 1627, 35S-ATPT3 in the sense orientation), a nontransformed (wt)
control, and an empty vector transformed control.

Figure 26 shows total tocopherol levels measured in T# Arabidopsis seed of 

line. 

Figure 27 shows total tocopherol levels measured in T# Arabidopsis seed of 

line. 

Figure 28 shows total tocopherol levels measured in developing canola seed of
line 10822-1.

Figure 29 shows results of phytyl prenyltransferase activity assay using
Synechocystis wild type and slr1737 knockout mutant membrane preparations.

Figure 30 is the chromatograph from an HPLC analysis of Synechocystis
extracts.

Figure 31 is a sequence alignment of the Arabidopsis homologue with the
sequence of the public database.

Figure 32 shows the results of hydropathic analysis of slr1737.

Figure 33 shows the results of hydropathic analysis of the Arabidopsis
homologue of slr1737.

Figure 34 shows the catalytic mechanism of various cyclase enzymes.

Figure 35 is a sequence alignment of slr1737, slr1737 Arabidopsis homologue
and the Arabidopsis chalcone isomerase.

DETAILED DESCRIPTION OF THE INVENTION 

The present invention provides, inter alia, compositions and methods for 
altering (for example, increasing and decreasing) the tocopherol levels and/or 
modulating their ratios in host cells. In particular, the present invention provides 
polynucleotides, polypeptides, and methods of use thereof for the modulation of
tocopherol content in host plant cells. 

The biosynthesis of α-tocopherol in higher plants involves condensation of
homogentisic acid and phytylpyrophosphate to form 2-methyl-6-phytylbenzoquinol
that can, by cyclization and subsequent methylations (Fiedler et al., 1982, Planta, 155:
511-515, Soll et al., 1980, Arch. Biochem. Biophys. 204: 544-550, Marshall et al.,
1985 Phytochem., 24: 1705-1711, all of which are herein incorporated by reference in
their entirety), form various tocopherols.

The Arabidopsis pds2 mutant, identified and characterized by Norris et al.
(1995), is deficient in tocopherol and plastoquinone-9 accumulation. Further genetic
and biochemical analysis suggested that the protein encoded by PDS2 may be
responsible for the prenylation of homogentisic acid. The PDS2 locus identified by
Norris et al. (1995) has been hypothesized to encode the tocopherol
phytyl-prenyltransferase, as the pds2 mutant fails to accumulate tocopherols.

Norris et al. (1995) determined that in Arabidopsis pds2 lies at the top of
chromosome 3, approximately 7 centimorgans above long hypocotyl2, based on the
genetic map. ATPT2 is located on chromosome 2 between 36 and 41 centimorgans,
lying on BAC F19F24, indicating that ATPT2 does not correspond to PDS2. Thus, it
is an aspect of the present invention to provide novel polynucleotides and
polypeptides involved in the prenylation of homogentisic acid. This reaction may be
a rate limiting step in tocopherol biosynthesis, and this gene has yet to be isolated.

U.S. Patent No. 5,432,069 describes the partial purification and
characterization of tocopherol cyclase from Chlorella protothecoides, Dunaliella
salina and wheat. The cyclase is described as being glycine rich, water soluble and with
a predicted MW of 48-50 kDa. However, only limited peptide fragment sequences
were available.

In one aspect, the present invention provides polynucleotide and polypeptide 
sequences involved in the prenylation of straight chain and aromatic compounds. 

Straight chain prenyltransferases, as used herein, comprise sequences which encode
proteins involved in the prenylation of straight chain compounds, including, but not
limited to, geranyl geranyl pyrophosphate and farnesyl pyrophosphate. Aromatic
prenyltransferases, as used herein, comprise sequences which encode proteins
involved in the prenylation of aromatic compounds, including, but not limited to,
menaquinone, ubiquinone, chlorophyll, and homogentisic acid. The prenyltransferase
of the present invention preferably prenylates homogentisic acid.

In another aspect, the invention provides polynucleotide and polypeptide
sequences to tocopherol cyclization enzymes. The 2,3-dimethyl-5-phytylplastoquinol
cyclase (tocopherol cyclase) is responsible for the cyclization of
2,3-dimethyl-5-phytylplastoquinol to tocopherol.

Isolated Polynucleotides, Proteins, and Polypeptides 

A first aspect of the present invention relates to isolated prenyltransferase 
polynucleotides. Another aspect of the present invention relates to isolated tocopherol 
cyclase polynucleotides. The polynucleotide sequences of the present invention
include isolated polynucleotides that encode the polypeptides of the invention having
a deduced amino acid sequence selected from the group of sequences set forth in the
Sequence Listing and to other polynucleotide sequences closely related to such
sequences and variants thereof.

The invention provides a polynucleotide sequence identical over its entire
length to each coding sequence as set forth in the Sequence Listing. The invention
also provides the coding sequence for the mature polypeptide or a fragment thereof, as
well as the coding sequence for the mature polypeptide or a fragment thereof in a
reading frame with other coding sequences, such as those encoding a leader or
secretory sequence, a pre-, pro-, or prepro- protein sequence. The polynucleotide can
also include non-coding sequences, including for example, but not limited to,
non-coding 5' and 3' sequences, such as the transcribed, untranslated sequences,
termination signals, ribosome binding sites, sequences that stabilize mRNA, introns,
polyadenylation signals, and additional coding sequence that encodes additional
amino acids. For example, a marker sequence can be included to facilitate the
purification of the fused polypeptide. Polynucleotides of the present invention also
include polynucleotides comprising a structural gene and the naturally associated
sequences that control gene expression.

The invention also includes polynucleotides of the formula:

X-(R_1)_n-(R_2)-(R_3)_n-Y

wherein, at the 5' end, X is hydrogen, and at the 3' end, Y is hydrogen or a metal, R_1
and R_3 are any nucleic acid residue, n is an integer between 1 and 3000, preferably
between 1 and 1000 and R_2 is a nucleic acid sequence of the invention, particularly a
nucleic acid sequence selected from the group set forth in the Sequence Listing and
preferably those of SEQ ID NOs: 1, 3, 5, 7, 8, 10, 11, 13-16, 18, 23, 29, 36, and 38.
In the formula, R_2 is oriented so that its 5' end residue is at the left, bound to R_1, and
its 3' end residue is at the right, bound to R_3. Any stretch of nucleic acid residues
denoted by either R group, where n is greater than 1, may be either a heteropolymer
or a homopolymer, preferably a heteropolymer.

The invention also relates to variants of the polynucleotides described herein
that encode for variants of the polypeptides of the invention. Variants that are 
fragments of the polynucleotides of the invention can be used to synthesize full-length 
polynucleotides of the invention. Preferred embodiments are polynucleotides 
encoding polypeptide variants wherein 5 to 10, 1 to 5, 1 to 3, 2, 1 or no amino acid
residues of a polypeptide sequence of the invention are substituted, added or deleted, 
in any combination. Particularly preferred are substitutions, additions, and deletions 
that are silent such that they do not alter the properties or activities of the 
polynucleotide or polypeptide. 

Further preferred embodiments of the invention are polynucleotides that are at
least 50%, 60%, or 70% identical over their entire length to a polynucleotide encoding
a polypeptide of the invention, and polynucleotides that are complementary to such
polynucleotides. More preferable are polynucleotides that comprise a region that is at
least 80% identical over its entire length to a polynucleotide encoding a polypeptide of
the invention and polynucleotides that are complementary thereto. In this regard,
polynucleotides at least 90% identical over their entire length are particularly
preferred, those at least 95% identical are especially preferred. Further, those with at
least 97% identity are highly preferred and those with at least 98% and 99% identity
are particularly highly preferred, with those at least 99% being the most highly
preferred.

Preferred embodiments are polynucleotides that encode polypeptides that 
retain substantially the same biological function or activity as the mature polypeptides 
encoded by the polynucleotides set forth in the Sequence Listing. 

The invention further relates to polynucleotides that hybridize to the
above-described sequences. In particular, the invention relates to polynucleotides that
hybridize under stringent conditions to the above-described polynucleotides. As used
herein, the terms "stringent conditions" and "stringent hybridization conditions" mean
that hybridization will generally occur if there is at least 95% and preferably at least
97% identity between the sequences. An example of stringent hybridization
conditions is overnight incubation at 42°C in a solution comprising 50% formamide,
5x SSC (150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH
7.6), 5x Denhardt's solution, 10% dextran sulfate, and 20 micrograms/milliliter
denatured, sheared salmon sperm DNA, followed by washing the hybridization
support in 0.1x SSC at approximately 65°C. Other hybridization and wash conditions
are well known and are exemplified in Sambrook, et al., Molecular Cloning: A
Laboratory Manual, Second Edition, Cold Spring Harbor, NY (1989), particularly
Chapter 11.
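
The salt concentrations implied by the hybridization solution (5x SSC) and the wash (0.1x SSC) can be worked out as in the following minimal sketch. It assumes, as is conventional, that the composition stated in the parenthetical above (150 mM NaCl, 15 mM trisodium citrate) is that of 1x SSC; the constant and function names are illustrative only and are not part of the source.

```python
# Assumed composition of 1x SSC in millimolar (conventional recipe; an
# assumption here, since the text states it only in a parenthetical).
SSC_1X_MM = {"NaCl": 150.0, "trisodium citrate": 15.0}


def ssc_composition(fold):
    """Return solute concentrations (mM) for an SSC buffer at the given fold strength."""
    if fold <= 0:
        raise ValueError("fold strength must be positive")
    return {solute: fold * millimolar for solute, millimolar in SSC_1X_MM.items()}


if __name__ == "__main__":
    for label, fold in (("hybridization (5x SSC)", 5), ("wash (0.1x SSC)", 0.1)):
        parts = ", ".join(
            f"{mm:.1f} mM {solute}" for solute, mm in ssc_composition(fold).items()
        )
        print(f"{label}: {parts}")
```

The low-salt, high-temperature wash is what makes the conditions stringent: reducing the buffer to one tenth strength lowers the ionic strength by roughly fifty-fold relative to the hybridization solution.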

The invention also provides a polynucleotide consisting essentially of a
polynucleotide sequence obtainable by screening an appropriate library containing the
complete gene for a polynucleotide sequence set forth in the Sequence Listing under
stringent hybridization conditions with a probe having the sequence of said
polynucleotide sequence or a fragment thereof; and isolating said polynucleotide
sequence. Fragments useful for obtaining such a polynucleotide include, for example,
probes and primers as described herein.

As discussed herein regarding polynucleotide assays of the invention, for
example, polynucleotides of the invention can be used as a hybridization probe for
RNA, cDNA, or genomic DNA to isolate full length cDNAs or genomic clones
encoding a polypeptide and to isolate cDNA or genomic clones of other genes that
have a high sequence similarity to a polynucleotide set forth in the Sequence Listing.
Such probes will generally comprise at least 15 bases. Preferably such probes will
have at least 30 bases and can have at least 50 bases. Particularly preferred probes
will have between 30 bases and 50 bases, inclusive.

The coding region of each gene that comprises or is comprised by a
polynucleotide sequence set forth in the Sequence Listing may be isolated by
screening using a DNA sequence provided in the Sequence Listing to synthesize an
oligonucleotide probe. A labeled oligonucleotide having a sequence complementary
to that of a gene of the invention is then used to screen a library of cDNA, genomic
DNA or mRNA to identify members of the library which hybridize to the probe. For
example, synthetic oligonucleotides are prepared which correspond to the
prenyltransferase or tocopherol cyclase EST sequences. The oligonucleotides are
used as primers in polymerase chain reaction (PCR) techniques to obtain 5' and 3'
terminal sequence of prenyltransferase or tocopherol cyclase genes. Alternatively,
where oligonucleotides of low degeneracy can be prepared from particular
prenyltransferase or tocopherol cyclase peptides, such probes may be used directly to
screen gene libraries for prenyltransferase or tocopherol cyclase gene sequences. In
particular, screening of cDNA libraries in phage vectors is useful in such methods due
to lower levels of background hybridization.
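
What "low degeneracy" means for an oligonucleotide back-translated from a peptide can be illustrated with a short sketch. It assumes the standard genetic code and counts how many distinct nucleotide sequences could encode a peptide stretch; the function names and the example peptide are hypothetical and are not taken from the Sequence Listing.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code
# (one-letter codes). Use of the standard code is an assumption.
CODONS_PER_RESIDUE = {
    "M": 1, "W": 1,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2, "K": 2, "N": 2, "Q": 2, "Y": 2,
    "I": 3,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "L": 6, "R": 6, "S": 6,
}


def oligo_degeneracy(peptide):
    """Count the distinct coding sequences that could encode the peptide."""
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide.upper())


def least_degenerate_window(peptide, width):
    """Return (start index, window, degeneracy) for the least degenerate stretch."""
    peptide = peptide.upper()
    if not 0 < width <= len(peptide):
        raise ValueError("width must be between 1 and the peptide length")
    start = min(
        range(len(peptide) - width + 1),
        key=lambda i: oligo_degeneracy(peptide[i:i + width]),
    )
    window = peptide[start:start + width]
    return start, window, oligo_degeneracy(window)


if __name__ == "__main__":
    # Hypothetical peptide, for illustration only.
    print(least_degenerate_window("LSRMWKF", 3))
```

Stretches rich in Met and Trp give the fewest alternative oligonucleotides, while Leu, Arg, and Ser each multiply the pool six-fold, which is why such residues are usually avoided when choosing a region for probe design.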

Typically, a prenyltransferase or tocopherol cyclase sequence obtainable from
the use of nucleic acid probes will show 60-70% sequence identity between the target
prenyltransferase or tocopherol cyclase sequence and the encoding sequence used as a
probe. However, lengthy sequences with as little as 50-60% sequence identity may
also be obtained. The nucleic acid probes may be a lengthy fragment of the nucleic
acid sequence, or may also be a shorter, oligonucleotide probe. When longer nucleic
acid fragments are employed as probes (greater than about 100 bp), one may screen at
lower stringencies in order to obtain sequences from the target sample which have
20-50% deviation (i.e., 50-80% sequence homology) from the sequences used as probe.
Oligonucleotide probes can be considerably shorter than the entire nucleic acid
sequence encoding a prenyltransferase or tocopherol cyclase enzyme, but should be
at least about 10, preferably at least about 15, and more preferably at least about 20
nucleotides. A higher degree of sequence identity is desired when shorter regions are
used as opposed to longer regions. It may thus be desirable to identify regions of
highly conserved amino acid sequence to design oligonucleotide probes for detecting
and recovering other related prenyltransferase or tocopherol cyclase genes. Shorter
probes are often particularly useful for polymerase chain reactions (PCR), especially
when highly conserved sequences can be identified. (See, Gould, et al., PNAS USA
(1989) 86:1934-1938.)

Another aspect of the present invention relates to prenyltransferase or
tocopherol cyclase polypeptides. Such polypeptides include isolated polypeptides set
forth in the Sequence Listing, as well as polypeptides and fragments thereof,
particularly those polypeptides which exhibit prenyltransferase or tocopherol cyclase
activity and also those polypeptides which have at least 50%, 60% or 70% identity,
preferably at least 80% identity, more preferably at least 90% identity, and most
preferably at least 95% identity to a polypeptide sequence selected from the group of
sequences set forth in the Sequence Listing, and also include portions of such
polypeptides, wherein such portion of the polypeptide preferably includes at least 30
amino acids and more preferably includes at least 50 amino acids.

"Identity", as is well understood in the art, is a relationship between two or
more polypeptide sequences or two or more polynucleotide sequences, as determined
by comparing the sequences. In the art, "identity" also means the degree of sequence
relatedness between polypeptide or polynucleotide sequences, as determined by the
match between strings of such sequences. "Identity" can be readily calculated by
known methods including, but not limited to, those described in Computational
Molecular Biology, Lesk, A.M., ed., Oxford University Press, New York (1988);
Biocomputing: Informatics and Genome Projects, Smith, D.W., ed., Academic Press,
New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A.M. and
Griffin, H.G., eds., Humana Press, New Jersey (1994); Sequence Analysis in
Molecular Biology, von Heinje, G., Academic Press (1987); Sequence Analysis
Primer, Gribskov, M. and Devereux, J., eds., Stockton Press, New York (1991); and
Carillo, H., and Lipman, D., SIAM J Applied Math, 48:1073 (1988). Methods to
determine identity are designed to give the largest match between the sequences
tested. Moreover, methods to determine identity are codified in publicly available
programs. Computer programs which can be used to determine identity between two
sequences include, but are not limited to, GCG (Devereux, J., et al., Nucleic Acids
Research 12(1):387 (1984)); suite of five BLAST programs, three designed for
nucleotide sequence queries (BLASTN, BLASTX, and TBLASTX) and two
designed for protein sequence queries (BLASTP and TBLASTN) (Coulson, Trends in
Biotechnology, 12: 76-80 (1994); Birren, et al., Genome Analysis, 1: 543-559 (1997)).
The BLASTX program is publicly available from NCBI and other sources (BLAST
Manual, Altschul, S., et al., NCBI NLM NIH, Bethesda, MD 20894; Altschul, S., et
al., J. Mol. Biol., 215:403-410 (1990)). The well known Smith Waterman algorithm
can also be used to determine identity.
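
As a rough illustration of how a percent identity figure relates to the thresholds recited above, the following sketch scores two sequences that have already been aligned. The choice of denominator (the full alignment length, gap columns included) is an assumption made here for simplicity; the cited programs each apply their own conventions. All names are illustrative.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


def highest_threshold_met(identity, thresholds=(50, 60, 70, 80, 90, 95)):
    """Return the highest recited identity threshold that is met, or None."""
    met = [t for t in thresholds if identity >= t]
    return max(met) if met else None


if __name__ == "__main__":
    identity = percent_identity("ACGT", "ACGA")
    print(identity, highest_threshold_met(identity))
```

A pair of sequences scoring 75% by this measure would satisfy the "at least 70% identity" embodiment but not the preferred 80% level.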

Parameters for polypeptide sequence comparison typically include the
following:

Algorithm: Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970)
Comparison matrix: BLOSUM62 from Henikoff and Henikoff, Proc. Natl.
Acad. Sci. USA 89:10915-10919 (1992)
Gap Penalty: 12
Gap Length Penalty: 4

A program which can be used with these parameters is publicly available as
the "gap" program from Genetics Computer Group, Madison Wisconsin. The above
parameters, along with no penalty for end gap, are the default parameters for peptide
comparisons.

Parameters for polynucleotide sequence comparison include the following:

Algorithm: Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970)
Comparison matrix: matches = +10; mismatches = 0
Gap Penalty: 50
Gap Length Penalty: 3

A program which can be used with these parameters is publicly available as
the "gap" program from Genetics Computer Group, Madison Wisconsin. The above
parameters are the default parameters for nucleic acid comparisons.
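
To show how the nucleotide parameters above enter a Needleman and Wunsch style global alignment, a minimal scoring sketch follows. It is not the "gap" program itself. It assumes that a gap of k residues costs the gap penalty plus k times the gap length penalty, and it penalizes end gaps like any other gap; both are simplifying assumptions, and the function name is hypothetical.

```python
from math import inf


def global_alignment_score(seq_a, seq_b, match=10, mismatch=0,
                           gap_penalty=50, gap_length_penalty=3):
    """Best global alignment score with affine gap costs (three-matrix recurrence)."""
    n, m = len(seq_a), len(seq_b)
    # aligned[i][j]: best score with seq_a[i-1] paired to seq_b[j-1]
    # gap_in_b[i][j]: best score with seq_a[i-1] opposite a gap
    # gap_in_a[i][j]: best score with seq_b[j-1] opposite a gap
    aligned = [[-inf] * (m + 1) for _ in range(n + 1)]
    gap_in_b = [[-inf] * (m + 1) for _ in range(n + 1)]
    gap_in_a = [[-inf] * (m + 1) for _ in range(n + 1)]
    aligned[0][0] = 0
    for i in range(1, n + 1):
        gap_in_b[i][0] = -(gap_penalty + i * gap_length_penalty)
    for j in range(1, m + 1):
        gap_in_a[0][j] = -(gap_penalty + j * gap_length_penalty)

    open_cost = gap_penalty + gap_length_penalty  # cost of the first gap residue
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            aligned[i][j] = pair + max(
                aligned[i - 1][j - 1], gap_in_b[i - 1][j - 1], gap_in_a[i - 1][j - 1]
            )
            gap_in_b[i][j] = max(
                aligned[i - 1][j] - open_cost,
                gap_in_a[i - 1][j] - open_cost,
                gap_in_b[i - 1][j] - gap_length_penalty,  # extend an open gap
            )
            gap_in_a[i][j] = max(
                aligned[i][j - 1] - open_cost,
                gap_in_b[i][j - 1] - open_cost,
                gap_in_a[i][j - 1] - gap_length_penalty,  # extend an open gap
            )
    return max(aligned[n][m], gap_in_b[n][m], gap_in_a[n][m])


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "ACGT"))
    print(global_alignment_score("ACGT", "AGT"))
```

With these weights a single one-residue gap costs 53, more than five matches are worth, so the alignment tolerates mismatches (scored 0) far more readily than it opens gaps.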

The invention also includes polypeptides of the formula:

X-(R_1)_n-(R_2)-(R_3)_n-Y

wherein, at the amino terminus, X is hydrogen, and at the carboxyl terminus, Y is
hydrogen or a metal, R_1 and R_3 are any amino acid residue, n is an integer between 1
and 1000, and R_2 is an amino acid sequence of the invention, particularly an amino
acid sequence selected from the group set forth in the Sequence Listing and preferably
those encoded by the sequences provided in SEQ ID NOs: 2, 4, 6, 9, 12, 17, 19-22,
24-28, 30, 32-35, 37, and 39. In the formula, R_2 is oriented so that its amino terminal
residue is at the left, bound to R_1, and its carboxy terminal residue is at the right,
bound to R_3. Any stretch of amino acid residues denoted by either R group, where n
is greater than 1, may be either a heteropolymer or a homopolymer, preferably a
heteropolymer.

Polypeptides of the present invention include isolated polypeptides encoded by
a polynucleotide comprising a sequence selected from the group of a sequence
contained in the Sequence Listing set forth herein.

The polypeptides of the present invention can be mature protein or can be part 
of a fusion protein. 

Fragments and variants of the polypeptides are also considered to be a part of
the invention. A fragment is a variant polypeptide which has an amino acid sequence
that is entirely the same as part but not all of the amino acid sequence of the
previously described polypeptides. The fragments can be "free-standing" or
comprised within a larger polypeptide of which the fragment forms a part or a region,
most preferably as a single continuous region. Preferred fragments are biologically
active fragments which are those fragments that mediate activities of the polypeptides
of the invention, including those with similar activity or improved activity or with a
decreased activity. Also included are those fragments that are antigenic or immunogenic
in an animal, particularly a human.

Variants of the polypeptide also include polypeptides that vary from the
sequences set forth in the Sequence Listing by conservative amino acid substitutions,
substitution of a residue by another with like characteristics. In general, such
substitutions are among Ala, Val, Leu and Ile; between Ser and Thr; between Asp and
Glu; between Asn and Gln; between Lys and Arg; or between Phe and Tyr.
Particularly preferred are variants in which 5 to 10; 1 to 5; 1 to 3 or one amino acid(s)
are substituted, deleted, or added, in any combination.
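
The conservative substitution groups listed above can be expressed as a small lookup, sketched below under the assumption that the two sequences being compared have the same length (substitutions only, no insertions or deletions). The function names and example sequences are hypothetical.

```python
# Conservative substitution groups as listed in the text, in one-letter codes:
# Ala/Val/Leu/Ile, Ser/Thr, Asp/Glu, Asn/Gln, Lys/Arg, Phe/Tyr.
CONSERVATIVE_GROUPS = [
    set("AVLI"), set("ST"), set("DE"), set("NQ"), set("KR"), set("FY"),
]


def is_conservative(original, replacement):
    """True if replacing one residue with the other stays within a listed group."""
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(
        original in group and replacement in group for group in CONSERVATIVE_GROUPS
    )


def classify_substitutions(reference, variant):
    """Return (conservative, non_conservative) substitution counts."""
    if len(reference) != len(variant):
        raise ValueError("this sketch compares equal-length sequences only")
    conservative = non_conservative = 0
    for ref_residue, var_residue in zip(reference.upper(), variant.upper()):
        if ref_residue == var_residue:
            continue
        if is_conservative(ref_residue, var_residue):
            conservative += 1
        else:
            non_conservative += 1
    return conservative, non_conservative


if __name__ == "__main__":
    # Hypothetical sequences: K->R and L->I are conservative, D->A is not.
    print(classify_substitutions("MKLSD", "MRISA"))
```

The total of the two counts can then be compared against the preferred ranges (5 to 10, 1 to 5, 1 to 3, or one changed residue) recited above.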

Variants that are fragments of the polypeptides of the invention can be used to 
produce the corresponding full length polypeptide by peptide synthesis. Therefore, 
these variants can be used as intermediates for producing the full-length polypeptides
of the invention. 

The polynucleotides and polypeptides of the invention can be used, for 
example, in the transformation of host cells, such as plant host cells, as further 
discussed herein. 

The invention also provides polynucleotides that encode a polypeptide that is a
mature protein plus additional amino or carboxyl-terminal amino acids, or amino
acids within the mature polypeptide (for example, when the mature form of the
protein has more than one polypeptide chain). Such sequences can, for example, play
a role in the processing of a protein from a precursor to a mature form, allow protein
transport, shorten or lengthen protein half-life, or facilitate manipulation of the protein
in assays or production. It is contemplated that cellular enzymes can be used to
remove any additional amino acids from the mature protein.

A precursor protein, having the mature form of the polypeptide fused to one or
more prosequences, may be an inactive form of the polypeptide. The inactive
precursors generally are activated when the prosequences are removed. Some or all of
the prosequences may be removed prior to activation. Such precursor proteins are
generally called proproteins.
Plant Constructs and Methods of Use 

Of particular interest is the use of the nucleotide sequences in recombinant
DNA constructs to direct the transcription or transcription and translation (expression)
of the prenyltransferase or tocopherol cyclase sequences of the present invention in a
host plant cell. The expression constructs generally comprise a promoter functional in
a host plant cell operably linked to a nucleic acid sequence encoding a
prenyltransferase or tocopherol cyclase of the present invention and a transcriptional
termination region functional in a host plant cell.

A first nucleic acid sequence is "operably linked" or "operably associated" 
with a second nucleic acid sequence when the sequences are so arranged that the first 
nucleic acid sequence affects the function of the second nucleic acid sequence.
Preferably, the two sequences are part of a single contiguous nucleic acid molecule 
and more preferably are adjacent. For example, a promoter is operably linked to a 
gene if the promoter regulates or mediates transcription of the gene in a cell. 

Those skilled in the art will recognize that there are a number of promoters 
which are functional in plant cells, and have been described in the literature.
Chloroplast and plastid specific promoters, chloroplast or plastid functional 
promoters, and chloroplast or plastid operable promoters are also envisioned. 

One set of plant functional promoters is the constitutive promoters, such as the
CaMV35S or FMV35S promoters, that yield high levels of expression in most plant
organs. Enhanced or duplicated versions of the CaMV35S and FMV35S promoters
are useful in the practice of this invention (Odell, et al. (1985) Nature 313:810-812;
Rogers, U.S. Patent Number 5,378,619). In addition, it may also be preferred to bring
about expression of the prenyltransferase or tocopherol cyclase gene in specific tissues
of the plant, such as leaf, stem, root, tuber, seed, fruit, etc., and the promoter chosen
should have the desired tissue and developmental specificity.

Of particular interest is the expression of the nucleic acid sequences of the
present invention from transcription initiation regions which are preferentially
expressed in a plant seed tissue. Examples of such seed preferential transcription
initiation sequences include those sequences derived from sequences encoding plant
storage protein genes or from genes involved in fatty acid biosynthesis in oilseeds.
Examples of such promoters include the 5' regulatory regions from such genes as
napin (Kridl et al., Seed Sci. Res. 1:209-219 (1991)), phaseolin, zein, soybean trypsin
inhibitor, ACP, stearoyl-ACP desaturase, soybean α' subunit of β-conglycinin (soy
7S, (Chen et al., Proc. Natl. Acad. Sci., 83:8560-8564 (1986))) and oleosin.

It may be advantageous to direct the localization of proteins conferring
prenyltransferase or tocopherol cyclase to a particular subcellular compartment, for
example, to the mitochondrion, endoplasmic reticulum, vacuoles, chloroplast or other
plastidic compartment. For example, where the genes of interest of the present
invention will be targeted to plastids, such as chloroplasts, for expression, the
constructs will also employ the use of sequences to direct the gene to the plastid.
Such sequences are referred to herein as chloroplast transit peptides (CTP) or plastid
transit peptides (PTP). In this manner, where the gene of interest is not directly
inserted into the plastid, the expression construct will additionally contain a gene
encoding a transit peptide to direct the gene of interest to the plastid. The chloroplast
transit peptides may be derived from the gene of interest, or may be derived from a
heterologous sequence having a CTP. Such transit peptides are known in the art. See,
for example, Von Heijne et al. (1991) Plant Mol. Biol. Rep. 9:104-126; Clark et al.
(1989) J. Biol. Chem. 264:17544-17550; della-Cioppa et al. (1987) Plant Physiol.
84:965-968; Romer et al. (1993) Biochem. Biophys. Res. Commun. 196:1414-1421;
and, Shah et al. (1986) Science 233:478-481.

Depending upon the intended use, the constructs may contain the nucleic acid
sequence which encodes the entire prenyltransferase or tocopherol cyclase protein, or
a portion thereof. For example, where antisense inhibition of a given
prenyltransferase or tocopherol cyclase protein is desired, the entire prenyltransferase
or tocopherol cyclase sequence is not required. Furthermore, where prenyltransferase
or tocopherol cyclase sequences used in constructs are intended for use as probes, it
may be advantageous to prepare constructs containing only a particular portion of a
prenyltransferase or tocopherol cyclase encoding sequence, for example a sequence
which is discovered to encode a highly conserved prenyltransferase or tocopherol
cyclase region.

The skilled artisan will recognize that there are various methods for the
inhibition of expression of endogenous sequences in a host cell. Such methods
include, but are not limited to, antisense suppression (Smith, et al. (1988) Nature
334:724-726), co-suppression (Napoli, et al. (1989) Plant Cell 2:279-289),
ribozymes (PCT Publication WO 97/10328), and combinations of sense and antisense
(Waterhouse, et al. (1998) Proc. Natl. Acad. Sci. USA 95:13959-13964). Methods for
the suppression of endogenous sequences in a host cell typically employ the
transcription or transcription and translation of at least a portion of the sequence to be
suppressed. Such sequences may be homologous to coding as well as non-coding
regions of the endogenous sequence.
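
As a purely illustrative aid, the antisense orientation of a portion of a target sequence is its reverse complement. The Python sketch below shows that operation; the function names, the example coordinates and the choice of region are hypothetical and do not describe any particular construct of the invention.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_insert(target: str, start: int, length: int) -> str:
    """Return the antisense orientation of target[start:start + length].

    `start` is 0-based. The portion may come from coding or non-coding
    regions; choosing it is left to the user of this sketch.
    """
    if start < 0 or length <= 0 or start + length > len(target):
        raise ValueError("requested portion lies outside the target sequence")
    return reverse_complement(target[start:start + length])
```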

Regulatory transcript termination regions may be provided in plant expression
constructs of this invention as well. Transcript termination regions may be provided
by the DNA sequence encoding the prenyltransferase or tocopherol cyclase or a
convenient transcription termination region derived from a different gene source, for
example, the transcript termination region which is naturally associated with the
transcript initiation region. The skilled artisan will recognize that any convenient
transcript termination region which is capable of terminating transcription in a plant
cell may be employed in the constructs of the present invention.
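
The arrangement of elements described in the preceding paragraphs (promoter, optional transit peptide, coding sequence, transcript termination region) can be summarized as a small data structure. The Python sketch below is only a bookkeeping aid; the class name, field names and the element labels used with it are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ExpressionCassette:
    """Ordered elements of a plant expression construct (labels only)."""

    promoter: str                          # e.g. a seed-preferential 5' region
    coding_sequence: str                   # prenyltransferase or cyclase sequence
    terminator: str                        # transcript termination region
    transit_peptide: Optional[str] = None  # CTP/PTP, if plastid targeting is used

    def elements(self) -> List[str]:
        """Return the elements in 5' to 3' order."""
        parts = [self.promoter]
        if self.transit_peptide is not None:
            parts.append(self.transit_peptide)
        parts.extend([self.coding_sequence, self.terminator])
        return parts

    def describe(self) -> str:
        """Human-readable summary of the cassette layout."""
        return " -> ".join(self.elements())
```

For example, `ExpressionCassette("napin 5'", "ATPT2", "napin 3'", transit_peptide="CTP").describe()` lists the four elements in 5' to 3' order; the labels stand in for actual sequences.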

Alternatively, constructs may be prepared to direct the expression of the
prenyltransferase or tocopherol cyclase sequences directly from the host plant cell
plastid. Such constructs and methods are known in the art and are generally
described, for example, in Svab, et al. (1990) Proc. Natl. Acad. Sci. USA 87:8526-8530
and Svab and Maliga (1993) Proc. Natl. Acad. Sci. USA 90:913-917 and in U.S.
Patent Number 5,693,507.

The prenyltransferase or tocopherol cyclase constructs of the present invention
can be used in transformation methods with additional constructs providing for the
expression of other nucleic acid sequences encoding proteins involved in the
production of tocopherols, or tocopherol precursors such as homogentisic acid and/or
phytylpyrophosphate. Nucleic acid sequences encoding proteins involved in the
production of homogentisic acid are known in the art, and include, but are not limited
to, 4-hydroxyphenylpyruvate dioxygenase (HPPD, EC 1.13.11.27) described for
example, by Garcia, et al. ((1999) Plant Physiol. 119(4):1507-1516), mono or
bifunctional tyrA (described for example by Xia, et al. (1992) J. Gen. Microbiol.
138:1309-1316, and Hudson, et al. (1984) J. Mol. Biol. 180:1023-1051), Oxygenase,
4-hydroxyphenylpyruvate di- (9CI), 4-Hydroxyphenylpyruvate dioxygenase;
p-Hydroxyphenylpyruvate dioxygenase; p-Hydroxyphenylpyruvate hydroxylase;
p-Hydroxyphenylpyruvate oxidase; p-Hydroxyphenylpyruvic acid hydroxylase;
p-Hydroxyphenylpyruvic hydroxylase; p-Hydroxyphenylpyruvic oxidase),
4-hydroxyphenylacetate, NAD(P)H:oxygen oxidoreductase (1-hydroxylating);
4-hydroxyphenylacetate 1-monooxygenase, and the like. In addition, constructs for
the expression of nucleic acid sequences encoding proteins involved in the production
of phytylpyrophosphate can also be employed with the prenyltransferase or tocopherol
cyclase constructs of the present invention. Nucleic acid sequences encoding proteins
involved in the production of phytylpyrophosphate are known in the art, and include,
but are not limited to, geranylgeranylpyrophosphate synthase (GGPPS),
geranylgeranylpyrophosphate reductase (GGH), 1-deoxyxylulose-5-phosphate
synthase, 1-deoxy-D-xylulose-5-phosphate reductoisomerase,
4-diphosphocytidyl-2-C-methylerythritol synthase, and isopentenyl pyrophosphate
isomerase.

The prenyltransferase or tocopherol cyclase sequences of the present invention 
find use in the preparation of transformation constructs having a second expression 
cassette for the expression of additional sequences involved in tocopherol 
biosynthesis. Additional tocopherol biosynthesis sequences of interest in the present 
invention include, but are not limited to, gamma-tocopherol methyltransferase
(Shintani, et al. (1998) Science 282(5396):2098-2100), tocopherol cyclase, and
tocopherol methyltransferase. 

A plant cell, tissue, organ, or plant into which the recombinant DNA 
constructs containing the expression constructs have been introduced is considered 
transformed, transfected, or transgenic. A transgenic or transformed cell or plant also 
includes progeny of the cell or plant and progeny produced from a breeding program 
employing such a transgenic plant as a parent in a cross and exhibiting an altered 
phenotype resulting from the presence of a prenyltransferase or tocopherol cyclase
nucleic acid sequence. 

Plant expression or transcription constructs having a prenyltransferase or
tocopherol cyclase as the DNA sequence of interest for increased or decreased
expression thereof may be employed with a wide variety of plant life, particularly,
plant life involved in the production of vegetable oils for edible and industrial uses.
Particularly preferred plants for use in the methods of the present invention include,
but are not limited to: Acacia, alfalfa, aneth, apple, apricot, artichoke, arugula,
asparagus, avocado, banana, barley, beans, beet, blackberry, blueberry, broccoli,
brussels sprouts, cabbage, canola, cantaloupe, carrot, cassava, cauliflower, celery,
cherry, chicory, cilantro, citrus, clementines, coffee, corn, cotton, cucumber, Douglas
fir, eggplant, endive, escarole, eucalyptus, fennel, figs, garlic, gourd, grape, grapefruit,
honey dew, jicama, kiwifruit, lettuce, leeks, lemon, lime, Loblolly pine, mango,
melon, mushroom, nectarine, nut, oat, oil palm, oil seed rape, okra, onion, orange, an
ornamental plant, papaya, parsley, pea, peach, peanut, pear, pepper, persimmon, pine,
pineapple, plantain, plum, pomegranate, poplar, potato, pumpkin, quince, radiata pine,
radicchio, radish, raspberry, rice, rye, sorghum, Southern pine, soybean, spinach,
squash, strawberry, sugarbeet, sugarcane, sunflower, sweet potato, sweetgum,
tangerine, tea, tobacco, tomato, triticale, turf, turnip, a vine, watermelon, wheat, yams,
and zucchini.

Most especially preferred are temperate oilseed crops. Temperate oilseed
crops of interest include, but are not limited to, rapeseed (Canola and High Erucic
Acid varieties), sunflower, safflower, cotton, soybean, peanut, coconut and oil palms,
and corn. Depending on the method for introducing the recombinant constructs into
the host cell, other DNA sequences may be required. Importantly, this invention is
applicable to dicotyledonous and monocotyledonous species alike and will be readily
applicable to new and/or improved transformation and regulation techniques.

Of particular interest is the use of prenyltransferase or tocopherol cyclase
constructs in plants to produce plants or plant parts, including, but not limited to,
leaves, stems, roots, reproductive parts, and seed, with a modified content of
tocopherols in plant parts having transformed plant cells.

For immunological screening, antibodies to the protein can be prepared by
injecting rabbits or mice with the purified protein or portion thereof, such methods of
preparing antibodies being well known to those in the art. Either monoclonal or
polyclonal antibodies can be produced, although typically polyclonal antibodies are
more useful for gene isolation. Western analysis may be conducted to determine that
a related protein is present in a crude extract of the desired plant species, as
determined by cross-reaction with the antibodies to the encoded proteins. When
cross-reactivity is observed, genes encoding the related proteins are isolated by
screening expression libraries representing the desired plant species. Expression
libraries can be constructed in a variety of commercially available vectors, including
lambda gt11, as described in Sambrook, et al. (Molecular Cloning: A Laboratory
Manual, Second Edition (1989) Cold Spring Harbor Laboratory, Cold Spring Harbor,
New York).

To confirm the activity and specificity of the proteins encoded by the 
identified nucleic acid sequences as prenyltransferase or tocopherol cyclase enzymes, 
in vitro assays are performed in insect cell cultures using baculovirus expression 
systems. Such baculovirus expression systems are known in the art and are described 
by Lee, et al., U.S. Patent Number 5,348,886, the entirety of which is herein
incorporated by reference. 

In addition, other expression constructs may be prepared to assay for protein 
activity utilizing different expression systems. Such expression constructs are 
transformed into yeast or a prokaryotic host and assayed for prenyltransferase or
tocopherol cyclase activity. Such expression systems are known in the art and are
readily available through commercial sources. 

In addition to the sequences described in the present invention, DNA coding
sequences useful in the present invention can be derived from algae, fungi, bacteria,
mammalian sources, plants, etc. Homology searches in existing databases using
signature sequences corresponding to conserved nucleotide and amino acid sequences
of prenyltransferase or tocopherol cyclase can be employed to isolate equivalent,
related genes from other sources such as plants and microorganisms. Searches in
EST databases can also be employed. Furthermore, the use of DNA sequences
encoding enzymes functionally and enzymatically equivalent to those disclosed herein,
wherein such DNA sequences are degenerate equivalents of the nucleic acid
sequences disclosed herein in accordance with the degeneracy of the genetic code, is
also encompassed by the present invention. Demonstration of the functionality of
coding sequences identified by any of these methods can be carried out by
complementation of mutants of appropriate organisms, such as Synechocystis,
Shewanella, yeast, Pseudomonas, Rhodobacteria, etc., that lack specific biochemical
reactions, or that have been mutated. The sequences of the DNA coding regions can
be optimized by gene resynthesis, based on codon usage, for maximum expression in
particular hosts.
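
To illustrate what is meant by degenerate equivalents and by codon-based resynthesis, the following Python sketch translates DNA with the standard genetic code, tests whether two coding sequences encode the same polypeptide, and recodes a sequence using a table of preferred codons. The function names and the `preferred` mapping are hypothetical; real host codon preferences would have to be taken from measured codon usage data.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(dna: str) -> str:
    """Translate an in-frame coding sequence ('*' marks a stop codon)."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def are_degenerate_equivalents(dna_a: str, dna_b: str) -> bool:
    """True if both sequences encode the same amino acid sequence."""
    return translate(dna_a) == translate(dna_b)


def recode(dna: str, preferred: dict) -> str:
    """Swap each codon for the host-preferred synonymous codon.

    `preferred` maps a one-letter amino acid to a codon; amino acids not
    present in the mapping keep their original codon.
    """
    dna = dna.upper()
    out = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        codon = dna[i:i + 3]
        aa = CODON_TABLE[codon]
        choice = preferred.get(aa, codon)
        if CODON_TABLE[choice] != aa:
            raise ValueError(f"{choice} does not encode {aa}")
        out.append(choice)
    return "".join(out)
```

By construction, `recode` returns a degenerate equivalent of its input, so the encoded polypeptide is unchanged.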

For the alteration of tocopherol production in a host cell, a second expression 
construct can be used in accordance with the present invention. For example, the 
prenyltransferase or tocopherol cyclase expression construct can be introduced into a 
host cell in conjunction with a second expression construct having a nucleotide 
sequence for a protein involved in tocopherol biosynthesis. 

The method of transformation in obtaining such transgenic plants is not critical 
to the instant invention, and various methods of plant transformation are currently 
available. Furthermore, as newer methods become available to transform crops, they 
may also be directly applied hereunder. For example, many plant species naturally 
susceptible to Agrobacterium infection may be successfully transformed via tripartite 
or binary vector methods of Agrobacterium mediated transformation. In many
instances, it will be desirable to have the construct bordered on one or both sides by 
T-DNA, particularly having the left and right borders, more particularly the right 
border. This is particularly useful when the construct uses A. tumefaciens or A. 
rhizogenes as a mode for transformation, although the T-DNA borders may find use 
with other modes of transformation. In addition, techniques of microinjection, DNA
particle bombardment, and electroporation have been developed which allow for the 
transformation of various monocot and dicot plant species. 

Normally, included with the DNA construct will be a structural gene having
the necessary regulatory regions for expression in a host and providing for selection of
transformant cells. The gene may provide for resistance to a cytotoxic agent, e.g.
antibiotic, heavy metal, toxin, etc., complementation providing prototrophy to an
auxotrophic host, viral immunity or the like. Depending upon the number of different
host species into which the expression construct or components thereof are introduced,
one or more markers may be employed, where different conditions for selection are
used for the different hosts.

Where Agrobacterium is used for plant cell transformation, a vector may be 
used which may be introduced into the Agrobacterium host for homologous 
recombination with T-DNA or the Ti- or Ri-plasmid present in the Agrobacterium 
host. The Ti- or Ri-plasmid containing the T-DNA for recombination may be armed
(capable of causing gall formation) or disarmed (incapable of causing gall formation), 
the latter being permissible, so long as the vir genes are present in the transformed 
Agrobacterium host. The armed plasmid can give a mixture of normal plant cells and 
gall. 

In some instances where Agrobacterium is used as the vehicle for transforming
host plant cells, the expression or transcription construct bordered by the T-DNA
border region(s) will be inserted into a broad host range vector capable of replication
in E. coli and Agrobacterium, there being broad host range vectors described in the
literature. Commonly used is pRK2 or derivatives thereof. See, for example, Ditta, et
al., (Proc. Nat. Acad. Sci., U.S.A. (1980) 77:7347-7351) and EPA 0 120 515, which
are incorporated herein by reference. Alternatively, one may insert the sequences to
be expressed in plant cells into a vector containing separate replication sequences, one
of which stabilizes the vector in E. coli, and the other in Agrobacterium. See, for
example, McBride, et al. (Plant Mol. Biol. (1990) 14:269-276), wherein the pRiHRI
(Jouanin, et al., Mol. Gen. Genet. (1985) 201:370-374) origin of replication is utilized
and provides for added stability of the plant expression vectors in host Agrobacterium
cells.

Included with the expression construct and the T-DNA will be one or more
markers, which allow for selection of transformed Agrobacterium and transformed
plant cells. A number of markers have been developed for use with plant cells, such
as resistance to chloramphenicol, kanamycin, the aminoglycoside G418, hygromycin,
or the like. The particular marker employed is not essential to this invention, one or
another marker being preferred depending on the particular host and the manner of
construction.

For transformation of plant cells using Agrobacterium, explants may be 
combined and incubated with the transformed Agrobacterium for sufficient time for 
transformation, the bacteria killed, and the plant cells cultured in an appropriate 
selective medium. Once callus forms, shoot formation can be encouraged by
employing the appropriate plant hormones in accordance with known methods and the
shoots transferred to rooting medium for regeneration of plants. The plants may then 
be grown to seed and the seed used to establish repetitive generations and for isolation 
of vegetable oils. 

There are several possible ways to obtain the plant cells of this invention
which contain multiple expression constructs. Any means for producing a plant
comprising a construct having a DNA sequence encoding the expression construct of
the present invention, and at least one other construct having another DNA sequence
encoding an enzyme, is encompassed by the present invention. For example, the
expression construct can be used to transform a plant at the same time as the second
construct either by inclusion of both expression constructs in a single transformation
vector or by using separate vectors, each of which expresses desired genes. The second
construct can be introduced into a plant which has already been transformed with the
prenyltransferase or tocopherol cyclase expression construct, or alternatively,
transformed plants, one expressing the prenyltransferase or tocopherol cyclase
construct and one expressing the second construct, can be crossed to bring the
constructs together in the same plant.

Transgenic plants of the present invention may be produced from tissue
culture, and subsequent generations grown from seed. Alternatively, transgenic plants
may be grown using apomixis. Apomixis is a genetically controlled method of
reproduction in plants where the embryo is formed without union of an egg and a
sperm. There are three basic types of apomictic reproduction: 1) apospory where the
embryo develops from a chromosomally unreduced egg in an embryo sac derived
from the nucellus, 2) diplospory where the embryo develops from an unreduced egg in
an embryo sac derived from the megaspore mother cell, and 3) adventitious embryony
where the embryo develops directly from a somatic cell. In most forms of apomixis,
pseudogamy or fertilization of the polar nuclei to produce endosperm is necessary for
seed viability. In apospory, a nurse cultivar can be used as a pollen source for
endosperm formation in seeds. The nurse cultivar does not affect the genetics of the
aposporous apomictic cultivar since the unreduced egg of the cultivar develops
parthenogenetically, but makes possible endosperm production. Apomixis is
economically important, especially in transgenic plants, because it causes any
genotype, no matter how heterozygous, to breed true. Thus, with apomictic
reproduction, heterozygous transgenic plants can maintain their genetic fidelity
throughout repeated life cycles. Methods for the production of apomictic plants are
known in the art. See, U.S. Patent No. 5,811,636, which is herein incorporated by
reference in its entirety.

The nucleic acid sequences of the present invention can be used in constructs
to provide for the expression of the sequence in a variety of host cells, both
prokaryotic and eukaryotic. Host cells of the present invention preferably include
monocotyledonous and dicotyledonous plant cells.

In general, the skilled artisan is familiar with the standard resource materials 
which describe specific conditions and procedures for the construction, manipulation 
and isolation of macromolecules (e.g., DNA molecules, plasmids, etc.), generation of 
recombinant organisms and the screening and isolating of clones (see, for example,
Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Press (1989); Maliga et al., Methods in Plant Molecular Biology, Cold Spring Harbor
Press (1995), the entirety of which is herein incorporated by reference; Birren et al.,
Genome Analysis: Analyzing DNA, 1, Cold Spring Harbor, New York, the entirety of
which is herein incorporated by reference).

Methods for the expression of sequences in insect host cells are known in the
art. Baculovirus expression vectors are recombinant insect viruses in which the coding
sequence for a chosen foreign gene has been inserted behind a baculovirus promoter
in place of the viral gene, e.g., polyhedrin (Smith and Summers, U.S. Pat. No.
4,745,051, the entirety of which is incorporated herein by reference). Baculovirus
expression vectors are known in the art, and are described for example in Doerfler,
Curr. Top. Microbiol. Immunol. 131:51-68 (1968); Luckow and Summers,
Bio/Technology 5:47-55 (1988a); Miller, Annual Review of Microbiol. 42:177-199
(1988); Summers, Curr. Comm. Molecular Biology, Cold Spring Harbor Press, Cold
Spring Harbor, N.Y. (1988); Summers and Smith, A Manual of Methods for
Baculovirus Vectors and Insect Cell Culture Procedures, Texas Ag. Exper. Station
Bulletin No. 1555 (1988), the entireties of which are herein incorporated by reference.

Methods for the expression of a nucleic acid sequence of interest in a fungal
host cell are known in the art. The fungal host cell may, for example, be a yeast cell or
a filamentous fungal cell. Methods for the expression of DNA sequences of interest in
yeast cells are generally described in "Guide to yeast genetics and molecular biology",
Guthrie and Fink, eds., Methods in Enzymology, Academic Press, Inc., Vol. 194 (1991)
and "Gene expression technology", Goeddel ed., Methods in Enzymology, Academic
Press, Inc., Vol. 185 (1991).

Mammalian cell lines available as hosts for expression are known in the art
and include many immortalized cell lines available from the American Type Culture
Collection (ATCC, Manassas, VA), such as HeLa cells, Chinese hamster ovary
(CHO) cells, baby hamster kidney (BHK) cells and a number of other cell lines.
Suitable promoters for mammalian cells are also known in the art and include, but are
not limited to, viral promoters such as that from Simian Virus 40 (SV40) (Fiers et al.,
Nature 273:113 (1978), the entirety of which is herein incorporated by reference),
Rous sarcoma virus (RSV), adenovirus (ADV) and bovine papilloma virus (BPV).
Mammalian cells may also require terminator sequences and poly-A addition
sequences. Enhancer sequences which increase expression may also be included and
sequences which promote amplification of the gene may also be desirable (for
example methotrexate resistance genes).

Vectors suitable for replication in mammalian cells are well known in the art,
and may include viral replicons, or sequences which insure integration of the
appropriate sequences encoding epitopes into the host genome. Plasmid vectors that
greatly facilitate the construction of recombinant viruses have been described (see, for
example, Mackett et al., J. Virol. 49:857 (1984); Chakrabarti et al., Mol. Cell Biol.
5:3403 (1985); Moss, In: Gene Transfer Vectors For Mammalian Cells (Miller and
Calos, eds., Cold Spring Harbor Laboratory, N.Y., p. 10, (1987)); all of which are
herein incorporated by reference in their entirety).

The invention also includes plants and plant parts, such as seed, oil and meal
derived from seed, and feed and food products processed from plants, which are
enriched in tocopherols. Of particular interest is seed oil obtained from transgenic
plants where the tocopherol level has been increased as compared to seed oil of a
non-transgenic plant.

The harvested plant material may be subjected to additional processing to
further enrich the tocopherol content. The skilled artisan will recognize that there are
many such processes or methods for refining, bleaching and degumming oil. United
States Patent Number 5,932,261, issued August 3, 1999, discloses one such process,
for the production of a natural carotene rich refined and deodorised oil by subjecting
the oil to a pressure of less than 0.060 mbar and to a temperature of less than
200° C. Oil distilled by this process has reduced free fatty acids, yielding a
refined, deodorised oil where Vitamin E contained in the feed oil is substantially
retained in the processed oil. The teachings of this patent are incorporated herein by
reference.



The invention now being generally described, it will be more readily
understood by reference to the following examples which are included for purposes of
illustration only and are not intended to limit the present invention.



EXAMPLES 

Example 1: Identification of Prenyltransferase or Tocopherol Cyclase Sequences

PSI-BLAST (Altschul, et al. (1997) Nuc. Acid Res. 25:3389-3402) profiles
were generated for both the straight chain and aromatic classes of prenyltransferases.
To generate the straight chain profile, a prenyltransferase from Porphyra purpurea
(Genbank accession 1709766) was used as a query against the NCBI non-redundant
protein database. The E. coli enzyme involved in the formation of ubiquinone, ubiA
(Genbank accession 1790473) was used as a starting sequence to generate the aromatic
prenyltransferase profile. These profiles were used to search public and proprietary
DNA and protein data bases. In Arabidopsis six putative prenyltransferases of the
straight-chain class were identified, ATPT1 (SEQ ID NO:9), ATPT7 (SEQ ID
NO:10), ATPT8 (SEQ ID NO:11), ATPT9 (SEQ ID NO:13), ATPT10 (SEQ ID
NO:14), and ATPT11 (SEQ ID NO:15), and six were identified of the aromatic class,
ATPT2 (SEQ ID NO:1), ATPT3 (SEQ ID NO:3), ATPT4 (SEQ ID NO:5), ATPT5
(SEQ ID NO:7), ATPT6 (SEQ ID NO:8), and ATPT12 (SEQ ID NO:16). Additional
prenyltransferase sequences from other plants related to the aromatic class of
prenyltransferases, such as soy (SEQ ID NOs:19-23, the deduced amino acid
sequence of SEQ ID NO:23 is provided in SEQ ID NO:24) and maize (SEQ ID
NOs:25-29, and 31) are also identified. The deduced amino acid sequence of ZMPT5
(SEQ ID NO:29) is provided in SEQ ID NO:30.

Searches are performed on a Silicon Graphics Unix computer using additional
Bioaccellerator hardware and GenWeb software supplied by Compugen Ltd. This
software and hardware enables the use of the Smith-Waterman algorithm in searching
DNA and protein databases using profiles as queries. The program used to query
protein databases is profilesearch. This is a search where the query is not a single
sequence but a profile based on a multiple alignment of amino acid or nucleic acid
sequences. The profile is used to query a sequence data set, i.e., a sequence database.
The profile contains all the pertinent information for scoring each position in a
sequence, in effect replacing the "scoring matrix" used for the standard query
searches. The program used to query nucleotide databases with a protein profile is
tprofilesearch. Tprofilesearch searches nucleic acid databases using an amino acid
profile query. As the search is running, sequences in the database are translated to
amino acid sequences in six reading frames. The output file for tprofilesearch is
identical to the output file for profilesearch except for an additional column that
indicates the frame in which the best alignment occurred.
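
The idea of querying a nucleotide sequence with an amino acid profile can be sketched in Python as follows. This is not the profilesearch or tprofilesearch program: it is a simplified, ungapped illustration in which the profile is a hypothetical list of per-position score dictionaries, all function names are invented for the example, and the default mismatch score is an arbitrary assumption.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def six_frame_translations(dna: str) -> dict:
    """Return {frame: protein} for frames +1, +2, +3, -1, -2, -3."""
    dna = dna.upper()
    strands = {1: dna, -1: dna.translate(_COMPLEMENT)[::-1]}
    frames = {}
    for sign, strand in strands.items():
        for offset in range(3):
            codons = (strand[i:i + 3] for i in range(offset, len(strand) - 2, 3))
            # Codons with ambiguous bases are rendered as 'X'.
            frames[sign * (offset + 1)] = "".join(
                CODON_TABLE.get(c, "X") for c in codons
            )
    return frames


def best_profile_hit(protein: str, profile: list, default: float = -1.0):
    """Slide the profile along the protein (no gaps); return (score, start)."""
    width = len(profile)
    best = (float("-inf"), -1)
    for start in range(len(protein) - width + 1):
        window = protein[start:start + width]
        score = sum(col.get(res, default) for col, res in zip(profile, window))
        best = max(best, (score, start))
    return best


def search_nucleotide(dna: str, profile: list):
    """Return (score, frame, start) of the best hit over all six frames."""
    hits = []
    for frame, protein in six_frame_translations(dna).items():
        score, start = best_profile_hit(protein, profile)
        hits.append((score, frame, start))
    return max(hits)
```

The frame element of the returned tuple plays the role of the additional output column mentioned above, reporting the reading frame in which the best match was found.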

The Smith-Waterman algorithm (Smith and Waterman (1981) supra) is used
to search for similarities between one sequence from the query and a group of
sequences contained in the database. E score values, as well as other sequence
information such as conserved peptide sequences, are used to identify related
sequences.
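
For readers unfamiliar with the algorithm, a minimal Smith-Waterman local alignment score can be computed as in the Python sketch below. It assumes a simple match/mismatch scheme and a linear gap penalty chosen only for illustration; the searches described above use profiles and dedicated hardware, which this sketch does not attempt to reproduce.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score between sequences a and b.

    Uses the Smith-Waterman recurrence with a linear gap penalty and keeps
    only two rows of the dynamic programming matrix in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(0, diag, up, left))
            best = max(best, curr[j])
        prev = curr
    return best
```

Because cell values are floored at zero, unrelated flanking sequence does not lower the score of a well-conserved local region, which is why the method suits searches for conserved peptide sequences.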

To obtain the entire coding region corresponding to the Arabidopsis
prenyltransferase sequences, synthetic oligonucleotide primers are designed to
amplify the 5' and 3' ends of partial cDNA clones containing prenyltransferase
sequences. Primers are designed according to the respective Arabidopsis
prenyltransferase sequences and used in Rapid Amplification of cDNA Ends (RACE)
reactions (Frohman et al. (1988) Proc. Natl. Acad. Sci. USA 85:8998-9002) using the
Marathon cDNA amplification kit (Clontech Laboratories Inc, Palo Alto, CA).

Amino acid sequence alignments between ATPT2 (SEQ ID NO:2), ATPT3
(SEQ ID NO:4), ATPT4 (SEQ ID NO:6), ATPT8 (SEQ ID NO:12), and ATPT12
(SEQ ID NO:17) are performed using ClustalW (Figure 1), and the percent identity
and similarities are provided in Table 1 below.

Table 1:

                       ATPT2   ATPT3   ATPT4   ATPT8  ATPT12
ATPT2   % Identity                12      13      11      15
        % Similar                 25      25      22      32
        % Gap                     17      20      20       9
ATPT3   % Identity                        12       6      22
        % Similar                         29      16      38
        % Gap                             20      24      14
ATPT4   % Identity                                 9      14
        % Similar                                 18      29
        % Gap                                     26      19
ATPT8   % Identity                                         7
        % Similar                                         19
        % Gap                                             20
ATPT12  % Identity
        % Similar
        % Gap
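
As an illustration of how figures of the kind shown in Table 1 can be derived from a pairwise alignment, the sketch below computes percent identity, similarity and gap for two aligned rows. The similarity groups and the choice of denominator (all aligned columns, with identical residues also counted as similar) are assumptions; the conventions actually used to produce Table 1 are not stated in the text.

```python
# Hypothetical conservative-substitution groups used only for this sketch.
SIMILAR_GROUPS = [set("ILVM"), set("FWY"), set("KRH"), set("DE"),
                  set("STNQ"), set("AG")]


def is_similar(x, y):
    """True for identical residues or residues in the same assumed group."""
    if x == "-" or y == "-":
        return False
    return x == y or any(x in g and y in g for g in SIMILAR_GROUPS)


def alignment_stats(row_a, row_b):
    """Return percent identity, similarity and gap for two aligned rows."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    # Ignore columns that are gaps in both rows (possible when the pair
    # is extracted from a larger multiple alignment).
    columns = [(x, y) for x, y in zip(row_a, row_b)
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no aligned columns")
    total = len(columns)
    identical = sum(1 for x, y in columns if x == y)
    similar = sum(1 for x, y in columns if is_similar(x, y))
    gaps = sum(1 for x, y in columns if x == "-" or y == "-")
    return {
        "identity": 100 * identical / total,
        "similar": 100 * similar / total,
        "gap": 100 * gaps / total,
    }


if __name__ == "__main__":
    # Toy aligned rows, not taken from the sequences of the invention.
    print(alignment_stats("MKVLA-GSTW", "MRVLAAGTTF"))
```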











Example 2: Preparation of Prenyl Transferase Expression Constructs

A plasmid containing the napin cassette derived from pCGN3223 (described
in USPN 5,639,790, the entirety of which is incorporated herein by reference) was
modified to make it more useful for cloning large DNA fragments containing multiple
restriction sites, and to allow the cloning of multiple napin fusion genes into plant
binary transformation vectors. An adapter comprised of the self-annealed
oligonucleotide of sequence
CGCGATTTAAATGGCGCGCCCTGCAGGCGGCCGCCTGCAGGGCGCGCCATTTAAAT
(SEQ ID NO:40) was ligated into the cloning vector pBC SK+ (Stratagene)
after digestion with the restriction endonuclease BssHII to construct vector
pCGN7765. Plasmids pCGN3223 and pCGN7765 were digested with NotI and
ligated together. The resultant vector, pCGN7770, contains the pCGN7765 backbone
with the napin seed-specific expression cassette from pCGN3223.
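
The self-annealing property of the SEQ ID NO:40 adapter can be checked with a short sketch. The function name is hypothetical, and the recognition sequences listed for SwaI, AscI, Sse8387I and NotI are the standard ones rather than values taken from this text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

ADAPTER = "CGCGATTTAAATGGCGCGCCCTGCAGGCGGCCGCCTGCAGGGCGCGCCATTTAAAT"

# Standard recognition sequences (not stated in the text above).
SITES = {
    "SwaI": "ATTTAAAT",
    "AscI": "GGCGCGCC",
    "Sse8387I": "CCTGCAGG",
    "NotI": "GCGGCCGC",
}


def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]


def self_annealing_overhang(oligo, max_overhang=8):
    """Return the 5' overhang left when two copies of the oligo anneal.

    Two copies can pair antiparallel when the oligo, minus a short 5'
    extension, is its own reverse complement. Returns None if no such
    core is found within max_overhang bases.
    """
    for n in range(max_overhang + 1):
        core = oligo[n:]
        if core and core == reverse_complement(core):
            return oligo[:n]
    return None


if __name__ == "__main__":
    print("5' overhang:", self_annealing_overhang(ADAPTER))
    for name, site in SITES.items():
        print(name, "present" if site in ADAPTER else "absent")
```

The annealed adapter leaves a 5'-CGCG extension at each end, which matches the overhang produced by BssHII and is consistent with ligation into BssHII-digested pBC SK+.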

The cloning cassette pCGN7787 contains essentially the same regulatory elements
as pCGN7770, except that the napin regulatory regions of pCGN7770 have
been replaced with the double CaMV 35S promoter and the tml polyadenylation and
transcriptional termination region.

A binary vector for plant transformation, pCGN5139, was constructed from
pCGN1558 (McBride and Summerfelt, (1990) Plant Molecular Biology, 14:269-276).
The polylinker of pCGN1558 was replaced as a HindIII/Asp718 fragment with a
polylinker containing unique restriction endonuclease sites, AscI, PacI, XbaI, SwaI,
BamHI, and NotI. The Asp718 and HindIII restriction endonuclease sites are retained
in pCGN5139.

A series of turbo binary vectors are constructed to allow for the rapid cloning
of DNA sequences into binary vectors containing transcriptional initiation regions
(promoters) and transcriptional termination regions.

The plasmid pCGN8618 was constructed by ligating oligonucleotides 5'-
TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:41) and 5'-
TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:42) into
SalI/XhoI-digested pCGN7770. A fragment containing the napin promoter,
polylinker and napin 3' region was excised from pCGN8618 by digestion with
Asp718I; the fragment was blunt-ended by filling in the 5' overhangs with Klenow
fragment then ligated into pCGN5139 that had been digested with Asp718I and
HindIII and blunt-ended by filling in the 5' overhangs with Klenow fragment. A
plasmid containing the insert oriented so that the napin promoter was closest to the
blunted Asp718I site of pCGN5139 and the napin 3' was closest to the blunted
HindIII site was subjected to sequence analysis to confirm both the insert orientation
and the integrity of cloning junctions. The resulting plasmid was designated
pCGN8622.

The plasmid pCGN8619 was constructed by ligating oligonucleotides 5'-
TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:43) and 5'-
TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:44) into
SalI/XhoI-digested pCGN7770. A fragment containing the napin promoter,
polylinker and napin 3' region was removed from pCGN8619 by digestion with
Asp718I; the fragment was blunt-ended by filling in the 5' overhangs with Klenow
fragment then ligated into pCGN5139 that had been digested with Asp718I and
HindIII and blunt-ended by filling in the 5' overhangs with Klenow fragment. A
plasmid containing the insert oriented so that the napin promoter was closest to the
blunted Asp718I site of pCGN5139 and the napin 3' was closest to the blunted
HindIII site was subjected to sequence analysis to confirm both the insert orientation
and the integrity of cloning junctions. The resulting plasmid was designated
pCGN8623.
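
The polylinker oligonucleotide pairs used for pCGN8618 and pCGN8619 can be examined with a brief sketch showing that each pair anneals to a 28 bp duplex flanked by 5'-TCGA extensions, and that exchanging the two strands reverses the polylinker. The function name and the assumption of a 4-base 5' overhang on both strands are illustrative; that SalI and XhoI both leave 5'-TCGA ends is standard enzyme behavior rather than a statement from this text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

SEQ_ID_41 = "TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG"
SEQ_ID_42 = "TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC"


def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]


def anneal(top, bottom, overhang=4):
    """Anneal two oligos written 5'->3'.

    Returns (top 5' overhang, duplex core read on the top strand,
    bottom 5' overhang), or None if the strands do not pair.
    Assumes both strands carry a 5' overhang of the same length.
    """
    core = top[overhang:]
    bottom_rc = reverse_complement(bottom)
    if bottom_rc[:len(bottom_rc) - overhang] != core:
        return None
    return top[:overhang], core, bottom[:overhang]


if __name__ == "__main__":
    print(anneal(SEQ_ID_41, SEQ_ID_42))  # orientation used for pCGN8618
    print(anneal(SEQ_ID_42, SEQ_ID_41))  # orientation used for pCGN8619
```

Because both ends carry the same overhang, the duplex can ligate into SalI/XhoI-digested vector in either direction, which is why the insert orientation is confirmed by sequencing in the constructions above.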

The plasmid pCGN8620 was constructed by ligating oligonucleotides 5'-
TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGGAGCT-3' (SEQ ID NO:45)
and 5'-CCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:46) into
SalI/SacI-digested pCGN7787. A fragment containing the d35S promoter, polylinker
and tml 3' region was removed from pCGN8620 by complete digestion with Asp718I
and partial digestion with NotI. The fragment was blunt-ended by filling in the 5'
overhangs with Klenow fragment then ligated into pCGN5139 that had been digested
with Asp718I and HindIII and blunt-ended by filling in the 5' overhangs with Klenow
fragment. A plasmid containing the insert oriented so that the d35S promoter was
closest to the blunted Asp718I site of pCGN5139 and the tml 3' was closest to the
blunted HindIII site was subjected to sequence analysis to confirm both the insert
orientation and the integrity of cloning junctions. The resulting plasmid was
designated pCGN8624.

The plasmid pCGN8621 was constructed by ligating oligonucleotides 5'-
TCGACCTGCAGGAAGCTTGCGGCCGCGGATCCAGCT-3' (SEQ ID NO:47)
and 5'-GGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:48) into
SalI/SacI-digested pCGN7787. A fragment containing the d35S promoter, polylinker
and tml 3' region was removed from pCGN8621 by complete digestion with Asp718I
and partial digestion with NotI. The fragment was blunt-ended by filling in the 5'
overhangs with Klenow fragment then ligated into pCGN5139 that had been digested
with Asp718I and HindIII and blunt-ended by filling in the 5' overhangs with Klenow
fragment. A plasmid containing the insert oriented so that the d35S promoter was
closest to the blunted Asp718I site of pCGN5139 and the tml 3' was closest to the
blunted HindIII site was subjected to sequence analysis to confirm both the insert
orientation and the integrity of cloning junctions. The resulting plasmid was
designated pCGN8625.

The plasmid construct pCGN8640 is a modification of pCGN8624 described
above. A 938 bp PstI fragment isolated from transposon Tn7 which encodes bacterial
spectinomycin and streptomycin resistance (Fling et al. (1985), Nucleic Acids
Research 13(19):7095-7106), a determinant for E. coli and Agrobacterium selection,
was blunt ended with Pfu polymerase. The blunt ended fragment was ligated into
pCGN8624 that had been digested with SpeI and blunt ended with Pfu polymerase.
The region containing the PstI fragment was sequenced to confirm both the insert
orientation and the integrity of cloning junctions.

The spectinomycin resistance marker was introduced into pCGN8622 and
pCGN8623 as follows. A 7.7 Kbp AvrII-SnaBI fragment from pCGN8640 was
ligated to a 10.9 Kbp AvrII-SnaBI fragment from pCGN8623 or pCGN8622,
described above. The resulting plasmids were pCGN8641 and pCGN8643,
respectively.

The plasmid pCGN8644 was constructed by ligating oligonucleotides 5'-
GATCACCTGCAGGAAGCTTGCGGCCGCGGATCCAATGCA-3' (SEQ ID
NO:49) and 5'-TTGGATCCGCGGCCGCAAGCTTCCTGCAGGT-3' (SEQ ID
NO:50) into BamHI-PstI digested pCGN8640.

Synthetic oligonucleotides were designed for use in Polymerase Chain Reactions
(PCR) to amplify the coding sequences of ATPT2, ATPT3, ATPT4, ATPT8, and
ATPT12 for the preparation of expression constructs and are provided in Table 2 below.

Table 2:

Name     Restriction Site  Sequence                                            SEQ ID NO:
ATPT2    5' NotI           GGATCCGCGGCCGCACAATGGAGTCTCTGCTCTCTAGTTCT           51
ATPT2    3' SseI           GGATCCTGCAGGTCACTTCAAAAAAGGTAACAGCAAGT              52
ATPT3    5' NotI           GGATCCGCGGCCGCACAATGGCGTTTTTTGGGCTCTCCCGTGTTT       53
ATPT3    3' SseI           GGATCCTGCAGGTTATTGAAAACTTCTTCCAAGTACAACT            54
ATPT4    5' NotI           GGATCCGCGGCCGCACAATGTGGCGAAGATCTGTTGTT              55
ATPT4    3' SseI           GGATCCTGCAGGTCATGGAGAGTAGAAGGAAGGAGCT               56
ATPT8    5' NotI           GGATCCGCGGCCGCACAATGGTACTTGCCGAGGTTCCAAAGCTTGCCTCT  57
ATPT8    3' SseI           GGATCCTGCAGGTCACTTGTTTCTGGTGATGACTCTAT              58
ATPT12   5' NotI           GGATCCGCGGCCGCACAATGACTTCGATTCTCAACACT              59
ATPT12   3' SseI           GGATCCTGCAGGTCAGTGTTGCGATGCTAATGCCGT                60





The coding sequences of ATPT2, ATPT3, ATPT4, ATPT8, and ATPT12 were all
amplified using the respective PCR primers shown in Table 2 above and cloned into the
TopoTA vector (Invitrogen). Constructs containing the respective prenyltransferase
sequences were digested with NotI and Sse8387I and cloned into the turbobinary vectors
described above.
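
The design of the Table 2 primers can be checked programmatically, as in the sketch below. It tests that each 5' primer carries a NotI site followed by an ATG start codon and that each 3' primer carries an Sse8387I site followed directly by the reverse complement of a stop codon. The recognition sequences are the standard ones, and the helper name and dictionary layout are hypothetical.

```python
NOTI = "GCGGCCGC"       # standard NotI recognition sequence
SSE8387I = "CCTGCAGG"   # standard Sse8387I recognition sequence
STOP_RC = {"TCA", "TTA", "CTA"}  # reverse complements of TGA, TAA, TAG

# Primer sequences as listed in Table 2 (keyed by gene, end, SEQ ID NO).
PRIMERS = {
    ("ATPT2", "5'", 51): "GGATCCGCGGCCGCACAATGGAGTCTCTGCTCTCTAGTTCT",
    ("ATPT2", "3'", 52): "GGATCCTGCAGGTCACTTCAAAAAAGGTAACAGCAAGT",
    ("ATPT3", "5'", 53): "GGATCCGCGGCCGCACAATGGCGTTTTTTGGGCTCTCCCGTGTTT",
    ("ATPT3", "3'", 54): "GGATCCTGCAGGTTATTGAAAACTTCTTCCAAGTACAACT",
    ("ATPT4", "5'", 55): "GGATCCGCGGCCGCACAATGTGGCGAAGATCTGTTGTT",
    ("ATPT4", "3'", 56): "GGATCCTGCAGGTCATGGAGAGTAGAAGGAAGGAGCT",
    ("ATPT8", "5'", 57): "GGATCCGCGGCCGCACAATGGTACTTGCCGAGGTTCCAAAGCTTGCCTCT",
    ("ATPT8", "3'", 58): "GGATCCTGCAGGTCACTTGTTTCTGGTGATGACTCTAT",
    ("ATPT12", "5'", 59): "GGATCCGCGGCCGCACAATGACTTCGATTCTCAACACT",
    ("ATPT12", "3'", 60): "GGATCCTGCAGGTCAGTGTTGCGATGCTAATGCCGT",
}


def primer_ok(end, seq):
    """Check the cloning site and the start or stop codon of one primer."""
    if end == "5'":
        pos = seq.find(NOTI)
        return pos >= 0 and "ATG" in seq[pos + len(NOTI):]
    pos = seq.find(SSE8387I)
    after_site = seq[pos + len(SSE8387I):pos + len(SSE8387I) + 3]
    return pos >= 0 and after_site in STOP_RC


if __name__ == "__main__":
    for (gene, end, seq_id), seq in PRIMERS.items():
        status = "ok" if primer_ok(end, seq) else "check"
        print(f"{gene} {end} (SEQ ID NO:{seq_id}): {status}")
```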

The sequence encoding ATPT2 prenyltransferase was cloned in the sense
orientation into pCGN8640 to produce the plant transformation construct pCGN10800
(Figure 2). The ATPT2 sequence is under control of the 35S promoter.

The ATPT2 sequence was also cloned in the antisense orientation into the
construct pCGN8641 to create pCGN10801 (Figure 3). This construct provides for the
antisense expression of the ATPT2 sequence from the napin promoter.

The ATPT2 coding sequence was also cloned in the sense orientation into the
vector pCGN8643 to create the plant transformation construct pCGN10822.

The ATPT2 coding sequence was also cloned in the antisense orientation into the
vector pCGN8644 to create the plant transformation construct pCGN10803 (Figure 4).

The ATPT4 coding sequence was cloned into the vector pCGN864 to create the
plant transformation construct pCGN10806 (Figure 5). The ATPT2 coding sequence was
cloned into the TopoTA™ vector (Invitrogen) to create the plant
transformation construct pCGN10807 (Figure 6). The ATPT3 coding sequence was cloned
into the TopoTA vector to create the plant transformation construct pCGN10808 (Figure
7). The ATPT3 coding sequence was cloned in the sense orientation into the vector
pCGN8640 to create the plant transformation construct pCGN10809 (Figure 8). The
ATPT3 coding sequence was cloned in the antisense orientation into the vector
pCGN8641 to create the plant transformation construct pCGN10810 (Figure 9). The
ATPT3 coding sequence was cloned into the vector pCGN8643 to create the plant
transformation construct pCGN10811 (Figure 10). The ATPT3 coding sequence was
cloned into the vector pCGN8644 to create the plant transformation construct
pCGN10812 (Figure 11). The ATPT4 coding sequence was cloned into the vector
pCGN8640 to create the plant transformation construct pCGN10813 (Figure 12). The
ATPT4 coding sequence was cloned into the vector pCGN8641 to create the plant
transformation construct pCGN10814 (Figure 13). The ATPT4 coding sequence was
cloned into the vector pCGN8643 to create the plant transformation construct
pCGN10815 (Figure 14). The ATPT4 coding sequence was cloned in the antisense
orientation into the vector pCGN8644 to create the plant transformation construct
pCGN10816 (Figure 15). The ATPT8 coding sequence was cloned in the sense
orientation into the vector pCGN8643 to create the plant transformation construct
pCGN10819 (Figure 17). The ATPT12 coding sequence was cloned into the vector
pCGN8640 to create the plant transformation construct pCGN10824 (Figure 18). The
ATPT12 coding sequence was cloned into the vector pCGN8643 to create the plant
transformation construct pCGN10825 (Figure 19). The ATPT8 coding sequence was
cloned into the vector pCGN8640 to create the plant transformation construct
pCGN10826 (Figure 20).

Example 3: Plant Transformation with Prenyl Transferase Constructs

Transgenic Brassica plants are obtained by Agrobacterium-mediated
transformation as described by Radke et al. (Theor. Appl. Genet. (1988) 75:685-694;
Plant Cell Reports (1992) 11:499-505). Transgenic Arabidopsis thaliana plants may
be obtained by Agrobacterium-mediated transformation as described by Valvekens et
al. (Proc. Nat. Acad. Sci. (1988) 85:5536-5540), or as described by Bent et al.
((1994), Science 265:1856-1860), or Bechtold et al. ((1993), C.R. Acad. Sci., Life
Sciences 316:1194-1199). Other plant species may be similarly transformed using
related techniques.

Alternatively, microprojectile bombardment methods, such as described by
Klein et al. (Bio/Technology 10:286-291), may also be used to obtain nuclear
transformed plants.

Example 4: Identification of Additional Prenyltransferases

Additional BLAST searches were performed using the ATPT2 sequence, a
sequence in the class of aromatic prenyltransferases. ESTs, and in some cases, full-length
coding regions, were identified in proprietary DNA libraries.

Soy full-length homologs to ATPT2 were identified by a combination of
BLAST (using ATPT2 protein sequence) and 5' RACE. Two homologs resulted
(SEQ ID NO:95 and SEQ ID NO:96). Translated amino acid sequences are provided
by SEQ ID NO:97 and SEQ ID NO:98.

A rice EST ATPT2 homolog is shown in SEQ ID NO:99 (obtained from
BLAST using the wheat ATPT2 homolog).

Other homolog sequences were obtained using ATPT2 and PSI-BLAST,
including EST sequences from wheat (SEQ ID NO:100), leek (SEQ ID NOs:101 and
102), canola (SEQ ID NO:103), corn (SEQ ID NOs:104, 105 and 106), cotton (SEQ
ID NO:107) and tomato (SEQ ID NO:108).

A PSI-BLAST profile generated using the E. coli ubiA (GenBank accession
1790473) sequence was used to analyze the Synechocystis genome. This analysis
identified 5 open reading frames (ORFs) in the Synechocystis genome that were
potentially prenyltransferases: slr0926 (annotated as ubiA,
4-hydroxybenzoate-octaprenyltransferase, SEQ ID NO:32), sll1899 (annotated as
ctaB, cytochrome c oxidase folding protein, SEQ ID NO:33), slr0056 (annotated as
g4, chlorophyll synthase 33 kd subunit, SEQ ID NO:34), slr1518 (annotated as menA,
menaquinone biosynthesis protein, SEQ ID NO:35), and slr1736 (annotated as a
hypothetical protein of unknown function, SEQ ID NO:36).

4A. Synechocystis Knock-outs

To determine the functionality of these ORFs and their involvement, if any, in
the biosynthesis of tocopherols, knockout constructs were made to disrupt the ORFs
identified in Synechocystis.

Synthetic oligos were designed to amplify regions from the 5' (5'-
TAATGTGTACATTGTCGGCCTC (17365') (SEQ ID NO:61) and 5'-
GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTCCACAATTCCCCGCACCGTC
(1736kanpr1)) (SEQ ID NO:62) and 3' (5'-
AGGCTAATAAGCACAAATGGGA (17363') (SEQ ID NO:63) and 5'-
GGTATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGGAATTGGTTTAGGTTATCCC
(1736kanpr2)) (SEQ ID NO:64) ends of the
slr1736 ORF. The 1736kanpr1 and 1736kanpr2 oligos contained 20 bp of homology
to the slr1736 ORF with an additional 40 bp of sequence homology to the ends of the
kanamycin resistance cassette. Separate PCR steps were completed with these oligos
and the products were gel purified and combined with the kanamycin resistance gene
from pUC4K (Pharmacia) that had been digested with HincII and gel purified away
from the vector backbone. The combined fragments were allowed to assemble
without oligos under the following conditions: 94°C for 1 min, 55°C for 1 min, 72°C
for 1 min plus 5 seconds per cycle for 40 cycles using Pfu polymerase in a 100 µl
reaction volume (Zhao, H. and Arnold (1997) Nucleic Acids Res. 25(6):1307-1308).
One microliter or five microliters of this assembly reaction was then amplified using
5' and 3' oligos nested within the ends of the ORF fragment, so that the resulting
product contained 100-200 bp of the 5' end of the Synechocystis gene to be knocked
out, the kanamycin resistance cassette, and 100-200 bp of the 3' end of the gene to be
knocked out. This PCR product was then cloned into the vector pGEM-T Easy
(Promega) to create the construct pMON21681 and used for Synechocystis
transformation.

Primers were also synthesized for the preparation of Synechocystis knockout
constructs for the other sequences using the same method as described above, with the
following primers. The ubiA 5' sequence was amplified using the primers 5'-
GGATCCATGGTTGCCCAAACCCCATC (SEQ ID NO:65) and 5'-
GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTGGGTAAGCAACAATGACCGGC
(SEQ ID NO:66). The 3' region was
amplified using the synthetic oligonucleotide primers 5'-
GAATTCTCAAAGCCAGCCCAGTAAC (SEQ ID NO:67) and 5'-
GGTATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGGGTGCGAAAAGGGTTTTCCC
(SEQ ID NO:68). The amplification products were combined with the
kanamycin resistance gene from pUC4K (Pharmacia) that had been digested with
HincII and gel purified away from the vector backbone. The annealed fragment was
amplified using 5' and 3' oligos nested within the ends of the ORF fragment (5'-
CCAGTGGTTTAGGCTGTGTGGTC (SEQ ID NO:69) and 5'-
CTGAGTTGGATGTATTGGATC (SEQ ID NO:70)), so that the resulting product
contained 100-200 bp of the 5' end of the Synechocystis gene to be knocked out, the
kanamycin resistance cassette, and 100-200 bp of the 3' end of the gene to be knocked
out. This PCR product was then cloned into the vector pGEM-T Easy (Promega) to
create the construct pMON21682 and used for Synechocystis transformation.

Primers were also synthesized for the preparation of Synechocystis knockout
constructs for the other sequences using the same method as described above, with the
following primers. The sll1899 5' sequence was amplified using the primers 5'-
GGATCCATGGTTACTTCGACAAAAATCC (SEQ ID NO:71) and 5'-
GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTGCTAGGCAACCGCTTAGTAC
(SEQ ID NO:72). The 3' region was amplified using the synthetic oligonucleotide
primers 5'-GAATTCTTAACCCAACAGTAAAGTTCCC (SEQ ID NO:73) and 5'-
GGTATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGCCGGCATTGTCTTTTACATG
(SEQ ID NO:74). The amplification products were combined with the kanamycin
resistance gene from pUC4K (Pharmacia) that had been digested with HincII and gel
purified away from the vector backbone. The annealed fragment was amplified using
5' and 3' oligos nested within the ends of the ORF fragment (5'-
GGAACCCTTGCAGCCGCTTC (SEQ ID NO:75)
and 5'-GTATGCCCAACTGGTGCAGAGG (SEQ ID NO:76)), so that the resulting
product contained 100-200 bp of the 5' end of the Synechocystis gene to be knocked
out, the kanamycin resistance cassette, and 100-200 bp of the 3' end of the gene to be
knocked out. This PCR product was then cloned into the vector pGEM-T Easy
(Promega) to create the construct pMON21679 and used for Synechocystis
transformation.

Primers were also synthesized for the preparation of Synechocystis knockout
constructs for the other sequences using the same method as described above, with the
following primers. The slr0056 5' sequence was amplified using the primers 5'-
GGATCCATGTCTGACACACAAAATACCG (SEQ ID NO:77) and 5'-
GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTCGCCAATACCAGCCACCAACAG
(SEQ ID NO:78). The 3' region was amplified using the
synthetic oligonucleotide primers 5'-GAATTCTCAAATCCCCGCATGGCCTAG
(SEQ ID NO:79) and 5'-
GGTATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGGCCTACGGCTTGGACGTGTGGG
(SEQ ID NO:80). The amplification products were
combined with the kanamycin resistance gene from pUC4K (Pharmacia) that had been
digested with HincII and gel purified away from the vector backbone. The annealed
fragment was amplified using 5' and 3' oligos nested within the ends of the ORF
fragment (5'-CACTTGGATTCCCCTGATCTG (SEQ ID NO:81) and 5'-
GCAATACCCGCTTGGAAAACG (SEQ ID NO:82)), so that the resulting product
contained 100-200 bp of the 5' end of the Synechocystis gene to be knocked out, the
kanamycin resistance cassette, and 100-200 bp of the 3' end of the gene to be knocked
out. This PCR product was then cloned into the vector pGEM-T Easy (Promega) to
create the construct pMON21677 and used for Synechocystis transformation.

Primers were also synthesized for the preparation of Synechocystis knockout
constructs for the other sequences using the same method as described above, with the
following primers. The slr1518 5' sequence was amplified using the primers 5'-
GGATCCATGACCGAATCTTCGCCCCTAGC (SEQ ID NO:83) and 5'-
GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTCAATCCTAGGTAGCCGAGGCG
(SEQ ID NO:84). The 3' region was
amplified using the synthetic oligonucleotide primers 5'-
GAATTCTTAGCCCAGGCCAGCCCAGCC (SEQ ID NO:85) and 5'-
GGTATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGGGGAATTGATTTGTTTAATTACC
(SEQ ID NO:86). The
amplification products were combined with the kanamycin resistance gene from
pUC4K (Pharmacia) that had been digested with HincII and gel purified away from the
vector backbone. The annealed fragment was amplified using 5' and 3' oligos nested
within the ends of the ORF fragment (5'-GCGATCGCCATTATCGCTTGG (SEQ ID
NO:87) and 5'-GCAGACTGGCAATTATCAGTAACG (SEQ ID NO:88)), so that
the resulting product contained 100-200 bp of the 5' end of the Synechocystis gene to
be knocked out, the kanamycin resistance cassette, and 100-200 bp of the 3' end of the
gene to be knocked out. This PCR product was then cloned into the vector pGEM-T
Easy (Promega) to create the construct pMON21680 and used for Synechocystis
transformation.
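
The two-part design of the knockout primers (a cassette-specific tail joined to a gene-specific 3' portion) can be made visible by comparing primers from different constructs, as sketched below. The helper name is hypothetical; the sequences are SEQ ID NOs:62, 66 and 78 as given above.

```python
# 5'-flank knockout primers from three of the constructs described above.
KANPR1_PRIMERS = {
    "slr1736 (SEQ ID NO:62)":
        "GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTCCACAATTCCCCGCACCGTC",
    "slr0926/ubiA (SEQ ID NO:66)":
        "GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTGGGTAAGCAACAATGACCGGC",
    "slr0056 (SEQ ID NO:78)":
        "GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTCGCCAATACCAGCCACCAACAG",
}


def shared_prefix(sequences):
    """Return the longest prefix common to all sequences."""
    prefix = []
    for column in zip(*sequences):
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return "".join(prefix)


if __name__ == "__main__":
    tail = shared_prefix(KANPR1_PRIMERS.values())
    print(f"shared cassette tail ({len(tail)} nt): {tail}")
    for name, seq in KANPR1_PRIMERS.items():
        gene_part = seq[len(tail):]
        print(f"{name}: gene-specific part ({len(gene_part)} nt) {gene_part}")
```

For these three primers the shared portion is 40 nt long and the remainder of the slr1736 primer is 20 nt, in agreement with the 40 bp of cassette homology and 20 bp of ORF homology stated for 1736kanpr1.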


4B. Transformation of Synechocystis

Cells of Synechocystis 6803 were grown to a density of approximately 2x10^8
cells per ml and harvested by centrifugation. The cell pellet was re-suspended in fresh
BG-11 medium (ATCC Medium 616) at a density of 1x10^9 cells per ml and used
immediately for transformation. One hundred microliters of these cells were mixed
with 5 µl of mini prep DNA and incubated with light at 30°C for 4 hours. This mixture
was then plated onto nylon filters resting on BG-11 agar supplemented with TES pH 8
and allowed to grow for 12-18 hours. The filters were then transferred to BG-11 agar
+ TES + 5 µg/ml kanamycin and allowed to grow until colonies appeared within 7-10
days (Packer and Glazer, 1988). Colonies were then picked into BG-11 liquid media
containing 5 µg/ml kanamycin and allowed to grow for 5 days. These cells were then
transferred to BG-11 media containing 10 µg/ml kanamycin and allowed to grow for 5
days and then transferred to BG-11 + kanamycin at 25 µg/ml and allowed to grow for 5
days. Cells were then harvested for PCR analysis to determine the presence of a
disrupted ORF and also for HPLC analysis to determine if the disruption had any
effect on tocopherol levels.
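
The cell-density arithmetic implied by the protocol above can be sketched as follows. The 50 ml culture volume is a hypothetical value chosen for illustration; only the two densities, the 100 µl aliquot and the kanamycin steps come from the text.

```python
def resuspension_volume_ml(culture_ml, culture_density, target_density):
    """Volume in which to resuspend a pellet to reach the target density.

    Assumes all cells are recovered in the pellet (no loss on harvest).
    """
    return culture_ml * culture_density / target_density


def cells_in_aliquot(density_per_ml, aliquot_ul):
    """Number of cells contained in an aliquot given in microliters."""
    return density_per_ml * aliquot_ul / 1000


# Kanamycin concentrations (µg/ml) used for stepwise selection, 5 days each.
KANAMYCIN_STEPS_UG_PER_ML = [5, 10, 25]

if __name__ == "__main__":
    # Hypothetical 50 ml culture at 2x10^8 cells/ml, concentrated to 1x10^9.
    print(resuspension_volume_ml(50, 2e8, 1e9), "ml")
    print(cells_in_aliquot(1e9, 100), "cells per transformation")
    print(5 * len(KANAMYCIN_STEPS_UG_PER_ML), "days of liquid selection")
```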

PCR analysis of the Synechocystis isolates for slr1736 and sll1899 showed
complete segregation of the mutant genome, meaning no copies of the wild type
genome could be detected in these strains. This suggests that the native
gene is not essential for cell function. HPLC analysis of these same isolates showed
that the sll1899 strain had no detectable reduction in tocopherol levels. However, the
strain carrying the knockout for slr1736 produced no detectable levels of tocopherol.

The amino acid sequences for the Synechocystis knockouts are compared using
ClustalW, and are provided in Table 3 below. Provided are the percent identities,
percent similarity, and the percent gap. The alignment of the sequences is provided in
Figure 21.



Table 3:

                      slr1736  slr0926  sll1899  slr0056  slr1518
slr1736  % Identity                 14       12       18       11
         % Similar                  29       30       34       26
         % Gap                       8        7       10        5
slr0926  % Identity                          20       19       14
         % Similar                           39       32       28
         % Gap                                7        9        4
sll1899  % Identity                                   17       13
         % Similar                                    29       29
         % Gap                                        12        9
slr0056  % Identity                                            15
         % Similar                                             31
         % Gap                                                  8
slr1518  % Identity
         % Similar
         % Gap











Amino acid sequence comparisons are performed using various Arabidopsis
prenyltransferase sequences and the Synechocystis sequences. The comparisons are
presented in Table 4 below. Provided are the percent identities, percent similarity,
and the percent gap. The alignment of the sequences is provided in Figure 22.

4C. Phytyl Prenyltransferase Enzyme Assays

[3H]Homogentisic acid in 0.1% H3PO4 (specific radioactivity 40 Ci/mmol).
Phytyl pyrophosphate was synthesized as described by Joo, et al. (1973) Can. J.
Biochem. 51:1527. 2-Methyl-6-phytylquinol and 2,3-dimethyl-5-phytylquinol were
synthesized as described by Soll, et al. (1980) Phytochemistry 19:215. Homogentisic
acid, α-, β-, δ-, and γ-tocopherol, and tocol were purchased commercially.

The wild-type strain of Synechocystis sp. PCC 6803 was grown in BG-11 medium
with bubbling air at 30°C under 50 µE m^-2 s^-1 fluorescent light, and 70% relative
humidity. The growth medium of the slr1736 knock-out (potential PPT) strain of this
organism was supplemented with 25 µg ml^-1 kanamycin. Cells were collected from
0.25 to 1 liter culture by centrifugation at 5000 g for 10 min and stored at -80°C.

Total membranes were isolated according to Zak's procedures with some
modifications (Zak, et al. (1999) Eur. J. Biochem. 261:311). Cells were broken on a
French press. Before the French press treatment, the cells were incubated for 1 hour
with lysozyme (0.5%, w/v) at 30°C in a medium containing 7 mM EDTA, 5 mM NaCl
and 10 mM Hepes-NaOH, pH 7.4. The spheroplasts were collected by centrifugation at
5000 g for 10 min and resuspended at 0.1 - 0.5 mg chlorophyll ml^-1 in 20 mM
potassium phosphate buffer, pH 7.8. A proper amount of protease inhibitor cocktail and
DNAase I from Boehringer Mannheim were added to the solution. French press
treatments were performed two to three times at 100 MPa. After breakage, the cell
suspension was centrifuged for 10 min at 5000 g to pellet unbroken cells, and this was
followed by centrifugation at 100,000 g for 1 hour to collect total membranes. The final
pellet was resuspended in a buffer containing 50 mM Tris-HCl and 4 mM MgCl2.

Chloroplast pellets were isolated from 250 g of spinach leaves obtained from local
markets. Deveined leaf sections were cut into grinding buffer (2 L/250 g leaves)
containing 2 mM EDTA, 1 mM MgCl2, 1 mM MnCl2, 0.33 M sorbitol, 0.1% ascorbic
acid, and 50 mM Hepes at pH 7.5. The leaves were homogenized for 3 sec three times
in a 1-L blender, and filtered through 4 layers of Miracloth. The supernatant was then
centrifuged at 5000 g for 6 min. The chloroplast pellets were resuspended in a small
amount of grinding buffer (Douce, et al., Methods in Chloroplast Molecular Biology,
239 (1982)).

Chloroplasts in pellets can be broken in three ways. Chloroplast pellets were first
aliquoted in 1 mg of chlorophyll per tube, centrifuged at 6000 rpm for 2 min in a
microcentrifuge, and grinding buffer was removed. Two hundred microliters of Triton
X-100 buffer (0.1% Triton X-100, 50 mM Tris-HCl pH 7.6 and 4 mM MgCl2) or
swelling buffer (10 mM Tris pH 7.6 and 4 mM MgCl2) was added to each tube and
incubated for 1/2 hour at 4°C. Then the broken chloroplast pellets were used for the
assay immediately. In addition, broken chloroplasts can also be obtained by freezing
in liquid nitrogen and storing at -80°C for 1/2 hour before use in the assay.

In some cases chloroplast pellets were further purified with a 40%/80% Percoll
gradient to obtain intact chloroplasts. The intact chloroplasts were broken with
swelling buffer, then either used for assay or further purified for envelope membranes
with a 20.5%/31.8% sucrose density gradient (Soll, et al. (1980) supra). The membrane
fractions were centrifuged at 100,000 g for 40 min and resuspended in 50 mM Tris-HCl
pH 7.6, 4 mM MgCl2.

Various amounts of [3H]HGA, 40 to 60 µM unlabelled HGA with specific activity
in the range of 0.16 to 4 Ci/mmole were mixed with a proper amount of 1 M
Tris-NaOH pH 10 to adjust pH to 7.6. HGA was reduced for 4 min with a trace amount of
solid NaBH4. In addition to HGA, the standard incubation mixture (final vol 1 mL)
contained 50 mM Tris-HCl, pH 7.6, 3-5 mM MgCl2, and 100 µM phytyl
pyrophosphate. The reaction was initiated by addition of Synechocystis total
membranes, spinach chloroplast pellets, spinach broken chloroplasts, or spinach
envelope membranes. The enzyme reaction was carried out for 2 hours at 23°C or 30°C
in the dark or light. The reaction was stopped by freezing with liquid nitrogen, and
stored at -80°C or directly by extraction.

A constant amount of tocol was added to each assay mixture and reaction products
were extracted with a 2 mL mixture of chloroform/methanol (1:2, v/v) to give a
monophasic solution. NaCl solution (2 mL; 0.9%) was added with vigorous shaking.
This extraction procedure was repeated three times. The organic layer containing the
prenylquinones was filtered through a 20 mµ filter, evaporated under and then
resuspended in 100 µL of ethanol.

The samples were mainly analyzed by a Normal-Phase HPLC method (isocratic
90% hexane and 10% methyl-t-butyl ether) using a Zorbax silica column, 4.6 x 250
mm. The samples were also analyzed by a Reversed-Phase HPLC method (isocratic
0.1% H3PO4 in MeOH) using a Vydac 201HS54 C18 column, 4.6 x 250 mm,
coupled with an Alltech C18 guard column. The amounts of products were calculated
based on the substrate specific radioactivity, and adjusted according to the % recovery
based on the amount of internal standard.
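
The quantification described above can be written out as a short worked calculation. This is a sketch under stated assumptions: radioactivity is taken to be reported in disintegrations per minute (dpm, i.e. counting efficiency already applied), recovery is taken as the ratio of internal standard (tocol) recovered to tocol added, and the function name and example numbers are hypothetical.

```python
DPM_PER_CURIE = 2.22e12  # 1 Ci = 3.7e10 Bq = 2.22e12 disintegrations/min


def product_pmol(product_dpm, specific_activity_ci_per_mmol,
                 standard_recovered, standard_added):
    """Convert radioactivity in a product peak to pmol of product.

    The amount is corrected for extraction losses using the fractional
    recovery of the internal standard.
    """
    # dpm contributed by one pmol of substrate (1 mmol = 1e9 pmol)
    dpm_per_pmol = specific_activity_ci_per_mmol * DPM_PER_CURIE / 1e9
    recovery = standard_recovered / standard_added
    return product_dpm / dpm_per_pmol / recovery


if __name__ == "__main__":
    # Hypothetical peak of 8880 dpm, substrate at 4 Ci/mmol, and 80%
    # recovery of the internal standard.
    print(round(product_pmol(8880, 4.0, 8.0, 10.0), 3), "pmol")
```

With a substrate at 4 Ci/mmol, one pmol corresponds to 8880 dpm, so a peak of 8880 dpm at 80% recovery represents 1.25 pmol of product.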

The amount of chlorophyll was determined as described in Arnon (1949) Plant
Physiol. 24:1. The amount of protein was determined by the Bradford method using
gamma globulin as a standard (Bradford, (1976) Anal. Biochem. 72:248).

Results of the assay demonstrate that 2-methyl-6-phytylplastoquinone is not
produced in the Synechocystis slr1736 knockout preparations. The results of the
phytyl prenyltransferase enzyme activity assay for the slr1736 knockout are presented
in Figure 23.

2 0 4D. Complementation of the slrl736 knockout with ATPT2 

In order to determine whether ATPT2 could complement the knockout of
slr1736 in Synechocystis 6803, a plasmid was constructed to express the ATPT2
sequence from the TAC promoter. A vector, plasmid psl1211, was obtained from the
lab of Dr. Himadri Pakrasi of Washington University, and is based on the plasmid
RSF1010, which is a broad host range plasmid (Ng W.-O., Zentella R., Wang, Y.,
Taylor J-S. A., Pakrasi, H.B. 2000. phrA, the major photoreactivating factor in the
cyanobacterium Synechocystis sp. strain PCC 6803 codes for a cyclobutane
pyrimidine dimer specific DNA photolyase. Arch. Microbiol. (in press)). The ATPT2
gene was isolated from the vector pCGN10817 by PCR using the following primers:
ATPT2nco.pr 5'-CCATGGATTCGAGTAAAGTTGTCGC (SEQ ID NO:89);
ATPT2ri.pr 5'-GAATTCACTTCAAAAAAGGTAACAG (SEQ ID NO:90). These
primers remove approximately 112 bp from the 5' end of the ATPT2 sequence,
which is thought to encode the chloroplast transit peptide. These primers also add an
NcoI site at the 5' end and an EcoRI site at the 3' end, which can be used for
subcloning into subsequent vectors. The PCR product obtained with these primers and
pCGN10817 was ligated into pGEM T easy, and the resulting vector, pMON21689, was
confirmed by sequencing using the M13 forward and M13 reverse primers. The
NcoI/EcoRI fragment from pMON21689 was then ligated with the EagI/EcoRI and
EagI/NcoI fragments from psl1211, resulting in pMON21690. The plasmid
pMON21690 was introduced into the slr1736 Synechocystis 6803 KO strain via
conjugation. Cells of SL906 (a helper strain) and DH10B cells containing
pMON21690 were grown to log phase (O.D. 600 = 0.4), and 1 ml was harvested by
centrifugation. The cell pellets were washed twice with a sterile BG-11 solution and
resuspended in 200 ul of BG-11. The following was mixed in a sterile Eppendorf
tube: 50 ul SL906, 50 ul DH10B cells containing pMON21690, and 100 ul of a fresh
culture of the slr1736 Synechocystis 6803 KO strain (O.D. 730 = 0.2-0.4). The cell
mixture was immediately transferred to a nitrocellulose filter resting on BG-11 and
incubated for 24 hours at 30°C and 2500 lux (50 uE) of light. The filter was then
transferred to BG-11 supplemented with 10 ug/ml gentamycin and incubated as above
for ~5 days. When colonies appeared, they were picked and grown up in liquid BG-11
+ gentamycin 10 ug/ml (Elhai, J. and Wolk, P. 1988. Conjugal transfer of DNA
to Cyanobacteria. Methods in Enzymology 167, 747-54). The liquid cultures were then
assayed for tocopherols by harvesting 1 ml of culture by centrifugation, extracting with
ethanol/pyrogallol, and separating by HPLC. The slr1736 Synechocystis 6803 KO strain
did not contain any detectable tocopherols, while the slr1736 Synechocystis 6803 KO
strain transformed with pMON21690 contained detectable alpha tocopherol. A
Synechocystis 6803 strain transformed with psl1211 (vector control) produced alpha
tocopherol as well.
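
The primer design used in section 4D, in which each primer carries a restriction site at its 5' end, can be checked with a short sketch. The primer sequences are those of SEQ ID NO:89 and SEQ ID NO:90 as given above; the recognition sequences CCATGG (NcoI) and GAATTC (EcoRI) are the standard ones, and the function names are hypothetical.

```python
NCOI_SITE = "CCATGG"   # NcoI recognition sequence
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence

ATPT2_NCO_PR = "CCATGGATTCGAGTAAAGTTGTCGC"  # SEQ ID NO:89
ATPT2_RI_PR = "GAATTCACTTCAAAAAAGGTAACAG"   # SEQ ID NO:90


def has_5prime_site(primer, site):
    """True if the primer begins (5' end) with the given restriction site."""
    return primer.upper().startswith(site)


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def is_palindromic(site):
    """True if the site reads the same on both strands."""
    return site == reverse_complement(site)
```

Because the NcoI site contains an ATG (positions 3 to 5 of CCATGG), the forward primer also supplies a start codon for the truncated ATPT2 coding sequence, which is consistent with removal of the putative transit peptide region.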
4E. Additional Evidence of Prenyltransferase Activity

To test the hypothesis that slr1736 or ATPT2 is sufficient as a single gene to
obtain phytyl prenyltransferase activity, both genes were expressed in SF9 cells and in
yeast. When either slr1736 or ATPT2 was expressed in insect cells (Table 5) or in
yeast, phytyl prenyltransferase activity was detectable in membrane preparations,
whereas membrane preparations of the yeast vector control and membrane preparations
of control insect cells did not exhibit phytyl prenyltransferase activity.

Table 5: Phytyl prenyltransferase activity

Enzyme source                     Enzyme activity [pmol/mg x h]
slr1736 expressed in SF9 cells    20
ATPT2 expressed in SF9 cells      6
SF9 cell control                  <0.05
Synechocystis 6803                0.25
Spinach chloroplasts              0.20

Example 5: Transgenic Plant Analysis 

5A. Arabidopsis

Arabidopsis plants transformed with constructs for the sense or antisense 
expression of the ATPT proteins were analyzed by High Pressure Liquid 
Chromatography (HPLC) for altered levels of total tocopherols, as well as altered 
levels of specific tocopherols (alpha, beta, gamma, and delta tocopherol). 

Extracts of leaves and seeds were prepared for HPLC as follows. For seed
extracts, 10 mg of seed was added to 1 g of microbeads (Biospec) in a sterile
microfuge tube to which 500 ul of 1% pyrogallol (Sigma Chem)/ethanol was added.
The mixture was shaken for 3 minutes in a mini Beadbeater (Biospec) on "fast" speed.
The extract was filtered through a 0.2 um filter into an autosampler tube. The filtered
extracts were then used in the HPLC analysis described below.

Leaf extracts were prepared by mixing 30-50 mg of leaf tissue with 1 g of
microbeads and freezing in liquid nitrogen until extraction. For extraction, 500 ul of 1%
pyrogallol in ethanol was added to the leaf/bead mixture, which was shaken for 1 minute on a
Beadbeater (Biospec) on "fast" speed. The resulting mixture was centrifuged for 4
minutes at 14,000 rpm and filtered as described above prior to HPLC analysis.

HPLC was performed on a Zorbax silica HPLC column (4.6 mm x 250 mm)
with fluorescence detection (excitation at 290 nm, emission at 336 nm), and
bandpass and slits. Solvent A was hexane and solvent B was methyl-t-butyl ether.
The injection volume was 20 ul, the flow rate was 1.5 ml/min, and the run time was 12
min (40°C) using the gradient shown in Table 6:

Table 6:

Time       Solvent A    Solvent B
0 min.     90%          10%
10 min.    90%          10%
11 min.    25%          75%
12 min.    90%          10%
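
As an illustration of the gradient in Table 6, the following sketch returns the solvent composition at any time during the 12 minute run. It assumes linear ramps between the tabulated time points, which the specification does not state explicitly; the names `GRADIENT`, `percent_b` and `percent_a` are hypothetical.

```python
# (time in minutes, % solvent B) pairs taken from Table 6
GRADIENT = [(0.0, 10.0), (10.0, 10.0), (11.0, 75.0), (12.0, 10.0)]


def percent_b(t_min):
    """Percent solvent B (methyl-t-butyl ether) at time t_min.

    Assumes linear interpolation between the time points of Table 6.
    """
    if t_min < GRADIENT[0][0] or t_min > GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-12 min run")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable for a sorted gradient table")


def percent_a(t_min):
    """Percent solvent A (hexane); A and B always sum to 100%."""
    return 100.0 - percent_b(t_min)
```

Under this assumption the run is isocratic (90% A, 10% B) for the first 10 minutes, followed by a brief wash at high methyl-t-butyl ether content and a return to the starting conditions.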

Tocopherol standards in 1% pyrogallol/ethanol were also run for comparison
(alpha tocopherol, gamma tocopherol, beta tocopherol, delta tocopherol, and
tocopherol (tocol), all from Matreya).

Standard curves for alpha, beta, delta, and gamma tocopherol were calculated
using Chemstation software. The absolute amount of component x is:

Absolute amount of x = Response_x × RF_x × dilution factor

where Response_x is the area of peak x, RF_x is the response factor for component x
(Amount_x/Response_x), and the dilution factor is 500 ul. The ng/mg tissue is found by:
total ng component/mg plant tissue.
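
The quantitation formula above can be sketched as follows. The function names are hypothetical, and the units are an assumption: the response factor is taken here as ng per ul per unit of peak area, so that multiplying by the 500 ul dilution factor yields ng. The numbers in the usage note are invented for illustration and are not measured values.

```python
def response_factor(standard_amount, standard_peak_area):
    """RF_x = Amount_x / Response_x, obtained from a standard of component x."""
    if standard_peak_area <= 0:
        raise ValueError("peak area must be positive")
    return standard_amount / standard_peak_area


def absolute_amount(peak_area, rf, dilution_factor=500.0):
    """Absolute amount of x = Response_x * RF_x * dilution factor."""
    return peak_area * rf * dilution_factor


def ng_per_mg_tissue(total_ng, tissue_mg):
    """Normalize the total amount of a component to the tissue weight."""
    if tissue_mg <= 0:
        raise ValueError("tissue weight must be positive")
    return total_ng / tissue_mg
```

For example, a standard giving a peak area of 1000 for 2.0 ng/ul has a response factor of 0.002; a sample peak of area 250 then corresponds to 250 × 0.002 × 500 = 250 ng, or 25 ng/mg for a 10 mg seed sample.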
Results of the HPLC analysis of seed extracts of transgenic Arabidopsis lines
containing pCGN10822 for the expression of ATPT2 from the napin promoter are
provided in Figure 24.

HPLC analysis of segregating T2 Arabidopsis seed tissue expressing
the ATPT2 sequence from the napin promoter (pCGN10822) demonstrates an
increased level of tocopherols in the seed. Total tocopherol levels are increased by as
much as 50% over the total tocopherol levels of non-transformed (wild-type)
Arabidopsis plants (Figure 25). Homozygous progeny from the top 3 lines (T3 seed)
have up to a two-fold (100%) increase in total tocopherol levels over control
Arabidopsis seed (Figure 26).

Furthermore, levels of particular tocopherols are also increased in
transgenic Arabidopsis plants expressing the ATPT2 nucleic acid sequence from the
napin promoter. Levels of delta tocopherol in these lines are increased more than
3-fold over the delta tocopherol levels obtained from the seeds of wild-type Arabidopsis
lines. Levels of gamma tocopherol in transgenic Arabidopsis lines expressing the
ATPT2 nucleic acid sequence are increased by as much as about 60% over the levels
obtained in the seeds of non-transgenic control lines. Furthermore, levels of alpha
tocopherol are increased as much as 3-fold over those obtained from non-transgenic
control lines.

Results of the HPLC analysis of seed extracts of transgenic Arabidopsis lines
containing pCGN10803 for the expression of ATPT2 from the enhanced 35S
promoter (antisense orientation) are provided in Figure 25. Two lines were identified
that have reduced total tocopherols, with up to a ten-fold decrease observed in T3 seed
compared to control Arabidopsis (Figure 27).






5B. Canola 

Brassica napus, variety SP30021, was transformed with pCGN10822
(napin-ATPT2-napin 3', sense orientation) using Agrobacterium tumefaciens-mediated
transformation. Flowers of the R0 plants were tagged upon pollination, and
developing seed was collected at 35 and 45 days after pollination (DAP).

Developing seed was assayed for tocopherol levels as described above for
Arabidopsis. Line 10822-1 shows a 20% increase in total tocopherols, compared to
the wild-type control, at 45 DAP. Figure 28 shows total tocopherol levels measured
in developing canola seed.



Example 6: Sequences to Tocopherol Cyclase 
6A. Preparation of the slr1737 Knockout

The Synechocystis sp. 6803 slr1737 knockout was constructed by the
following method. The GPS™-1 Genome Priming System (New England Biolabs)
was used to insert, by a Tn7 transposase system, a kanamycin resistance cassette
into slr1737. A plasmid from a Synechocystis genomic library clone containing 652
base pairs of the targeted orf (Synechocystis genome base pairs 1324051 - 1324703;
the predicted orf is base pairs 1323672 - 1324763, as annotated by Cyanobase) was used
as target DNA. The reaction was performed according to the manufacturer's protocol.
The reaction mixture was then transformed into E. coli DH10B electrocompetent cells
and plated. Colonies from this transformation were then screened for transposon
insertions into the target sequence by amplifying with M13 Forward and Reverse
Universal primers, yielding a product of 652 base pairs plus ~1700 base pairs, the size
of the transposon kanamycin cassette, for a total fragment size of ~2300 base pairs.
After this determination, it was then necessary to determine the approximate location
of the insertion within the targeted orf, as 100 base pairs of orf sequence was
estimated as necessary for efficient homologous recombination in Synechocystis. This
was accomplished through amplification reactions using either of the primers to the
ends of the transposon, Primer S (5' end) or N (3' end), in combination with either an
M13 Forward or Reverse primer. That is, four different primer combinations were
used to map each potential knockout construct: Primer S - M13 Forward, Primer S -
M13 Reverse, Primer N - M13 Forward, Primer N - M13 Reverse. The construct
used to transform Synechocystis and knock out slr1737 was determined to consist of
approximately 150 base pairs of slr1737 sequence on the 5' side of the transposon
insertion and approximately 500 base pairs on the 3' side, with the transcription of the
orf and kanamycin cassette in the same direction. The nucleic acid sequence of
slr1737 is provided in SEQ ID NO:38, and the deduced amino acid sequence is provided in
SEQ ID NO:39.
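
The screening logic described above can be summarized in a short sketch: the expected size of the screening PCR product, the four primer combinations used for mapping, and the check that both flanks exceed the roughly 100 base pairs estimated as necessary for homologous recombination. The constant and function names are hypothetical, and the sizes are the approximate values given in the text.

```python
from itertools import product

TARGET_BP = 652        # cloned portion of the slr1737 orf
TRANSPOSON_BP = 1700   # approximate size of the kanamycin cassette
MIN_FLANK_BP = 100     # flank length estimated as necessary for recombination

TRANSPOSON_PRIMERS = ["Primer S", "Primer N"]
VECTOR_PRIMERS = ["M13 Forward", "M13 Reverse"]


def expected_screen_product():
    """Expected size of the M13 Forward/Reverse product after insertion."""
    return TARGET_BP + TRANSPOSON_BP


def mapping_combinations():
    """All transposon-end / vector primer pairs used to map an insertion."""
    return list(product(TRANSPOSON_PRIMERS, VECTOR_PRIMERS))


def flanks_sufficient(flank_5prime_bp, flank_3prime_bp):
    """True if both flanks reach the estimated minimum for recombination."""
    return min(flank_5prime_bp, flank_3prime_bp) >= MIN_FLANK_BP
```

The sum of 652 and 1700 base pairs is 2352, in line with the fragment of approximately 2300 base pairs reported, and the construct chosen (about 150 base pairs on the 5' side and 500 on the 3' side) satisfies the flank criterion on both sides.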

Cells of Synechocystis 6803 were grown to a density of ~2x10^8 cells per ml
and harvested by centrifugation. The cell pellet was resuspended in fresh BG-11
medium at a density of 1x10^9 cells per ml and used immediately for transformation.
100 ul of these cells were mixed with 5 ul of miniprep DNA and incubated with light
at 30°C for 4 hours. This mixture was then plated onto nylon filters resting on BG-11
agar supplemented with TES pH 8 and allowed to grow for 12-18 hours. The filters
were then transferred to BG-11 agar + TES + 5 ug/ml kanamycin and allowed to grow
until colonies appeared, within 7-10 days (Packer and Glazer, 1988). Colonies were
then picked into BG-11 liquid media containing 5 ug/ml kanamycin and allowed to
grow for 5 days. These cells were then transferred to BG-11 media containing
10 ug/ml kanamycin and allowed to grow for 5 days, and then transferred to BG-11 +
kanamycin at 25 ug/ml and allowed to grow for 5 days. Cells were then harvested for
PCR analysis to determine the presence of a disrupted ORF and also for HPLC
analysis to determine whether the disruption had any effect on tocopherol levels.

PCR analysis of the Synechocystis isolates, using primers to the ends of the
slr1737 orf, showed complete segregation of the mutant genome, meaning no copies
of the wild type genome could be detected in these strains. This suggests that
function of the native gene is not essential for cell function. HPLC analysis showed
that the strain carrying the slr1737 knockout produced no detectable levels of tocopherol.



6B. The relation of slr1737 and slr1736

The slr1737 gene occurs in Synechocystis downstream of and in the same
orientation as slr1736, the phytyl prenyltransferase. In bacteria this proximity often
indicates an operon structure and therefore an expression pattern that is linked in all
genes belonging to this operon. Occasionally such operons contain several genes that
are required to constitute one enzyme. To confirm that slr1737 is not required for
phytyl prenyltransferase activity, phytyl prenyltransferase was measured in extracts
from the Synechocystis slr1737 knockout mutant. Figure 29 shows that extracts from
the Synechocystis slr1737 knockout mutant still contain phytyl prenyltransferase
activity. The molecular organization of genes in Synechocystis 6803 is shown in A.
Figures B and C show HPLC traces (normal phase HPLC) of reaction products
obtained with membrane preparations from Synechocystis wild type and the slr1737
knockout mutant, respectively.

The fact that slr1737 is not required for the PPT activity provides additional
evidence that ATPT2 and slr1736 encode phytyl prenyltransferases.

6C. Synechocystis Knockouts

Synechocystis 6803 wild type and the Synechocystis slr1737 knockout mutant
were grown photoautotrophically. Cells from a 20 ml culture in the late logarithmic
growth phase were harvested and extracted with ethanol. Extracts were separated by
isocratic normal-phase HPLC using hexane/methyl-t-butyl ether (95/5) and a
Zorbax silica column, 4.6 x 250 mm. Tocopherols and tocopherol intermediates were
detected by fluorescence (excitation 290 nm, emission 336 nm) (Figure 30).

Extracts of Synechocystis 6803 contained a clear signal of alpha-tocopherol.
2,3-Dimethyl-5-phytylplastoquinol was below the limit of detection in extracts from
the Synechocystis wild type (C). In contrast, extracts from the Synechocystis slr1737
knockout mutant did not contain alpha-tocopherol, but contained
2,3-dimethyl-5-phytylplastoquinol (D), indicating that the interruption of slr1737 has resulted in a
block of the 2,3-dimethyl-5-phytylplastoquinol cyclase reaction.

Chromatograms of the standard compounds alpha-, beta-, gamma-, and delta-tocopherol
and 2,3-dimethyl-5-phytylplastoquinol are shown in A and B. Chromatograms of
extracts from Synechocystis wild type and the Synechocystis slr1737 knockout mutant
are shown in C and D, respectively. Abbreviations: 2,3-DMPQ,
2,3-dimethyl-5-phytylplastoquinol.

6D. Incubation with Lysozyme treated Synechocystis

Synechocystis 6803 wild type and slr1737 knockout mutant cells from the late
logarithmic growth phase (approximately 1 g wet cells per experiment in a total
volume of 3 ml) were treated with lysozyme and subsequently incubated with
S-adenosylmethionine and phytylpyrophosphate, plus radiolabelled homogentisic
acid. After 17 h of incubation in the dark at room temperature, the samples were extracted
with 6 ml chloroform/methanol (1/2 v/v). Phase separation was obtained by the
addition of 6 ml of 0.9% NaCl solution. This procedure was repeated three times. Under
these conditions 2,3-dimethyl-5-phytylplastoquinol is oxidized to form
2,3-dimethyl-5-phytylplastoquinone.

The extracts were analyzed by normal phase and reverse phase HPLC. Using
extracts from wild type Synechocystis cells, radiolabelled gamma-tocopherol and
traces of radiolabelled 2,3-dimethyl-5-phytylplastoquinone were detected. When
extracts from the slr1737 knockout mutant were analyzed, only radiolabelled
2,3-dimethyl-5-phytylplastoquinone was detectable. The amount of
2,3-dimethyl-5-phytylplastoquinone was significantly increased compared to wild type extracts. Heat
treated samples of the wild type and the slr1737 knockout mutant produced neither
radiolabelled 2,3-dimethyl-5-phytylplastoquinone nor radiolabelled tocopherols.
These results further support the role of the slr1737 expression product in the
cyclization of 2,3-dimethyl-5-phytylplastoquinol.

6E. Arabidopsis Homologue to slr1737

An Arabidopsis homologue to slr1737 was identified from a BLASTALL
search of both public and proprietary databases using Synechocystis sp. 6803 gene
slr1737 as the query. SEQ ID NO:109 and SEQ ID NO:110 are the DNA and
translated amino acid sequences, respectively, of the Arabidopsis homologue to
slr1737. The start is found at the ATG at base 56 in SEQ ID NO:109.

The sequence obtained for the homologue from the proprietary database
differs from that in the public database (F4D11.30, BAC AL022537) in having a start site
471 base pairs upstream of the start identified in the public sequence. A comparison
of the public and proprietary sequences is provided in Figure 31. The correct start
corresponds to position 12080 within the public database sequence, while the public
sequence start is given as being at position 11609.

Attempts to amplify a slr1737 homologue were unsuccessful using primers
designed from the public database, while amplification of the gene was accomplished
with primers obtained from SEQ ID NO:109.

Analysis of the protein sequence to identify a transit peptide sequence predicted
two potential cleavage sites, one between amino acids 48 and 49, and the other
between amino acids 98 and 99.

6F. slr1737 Protein Information

The slr1737 orf comprises 363 amino acid residues and has a predicted MW of
41 kDa (SEQ ID NO:39). Hydropathic analysis indicates the protein is hydrophilic
(Figure 32).

The Arabidopsis homologue to slr1737 (SEQ ID xx) comprises 488 amino
acid residues, has a predicted MW of 55 kDa, and has a putative transit peptide
sequence comprising the first 98 amino acids. The predicted MW of the mature form
of the Arabidopsis homologue is 44 kDa. The hydropathic plot for the Arabidopsis
homologue also reveals that it is hydrophilic (Figure 33). Further BLAST analysis of
the Arabidopsis homologue reveals limited sequence identity (25% sequence
identity) with the beta-subunit of respiratory nitrate reductase. The sequence
identity to nitrate reductase suggests that the slr1737 orf encodes an enzyme whose
mechanism likely involves general acid catalysis.
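
The relation between the precursor and mature molecular weights quoted above can be checked with a rough sketch. It assumes a uniform mean residue mass derived from the reported precursor values (55 kDa over 488 residues); this is a simplification for illustration, not a sequence-based calculation and not the method used in the specification, and the function names are hypothetical.

```python
def mean_residue_mass_kda(total_residues, precursor_mw_kda):
    """Average mass per residue implied by the precursor protein."""
    if total_residues <= 0:
        raise ValueError("total_residues must be positive")
    return precursor_mw_kda / total_residues


def mature_mw_kda(total_residues, precursor_mw_kda, transit_peptide_len):
    """Estimate the MW after removal of an N-terminal transit peptide."""
    if not 0 <= transit_peptide_len < total_residues:
        raise ValueError("transit peptide length is out of range")
    per_residue = mean_residue_mass_kda(total_residues, precursor_mw_kda)
    return (total_residues - transit_peptide_len) * per_residue
```

With the values in the text, `mature_mw_kda(488, 55, 98)` gives approximately 44 kDa for the 390 remaining residues, which agrees with the predicted MW of the mature form.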
Investigation of known enzymes involved in tocopherol metabolism indicated
that the best candidate corresponding to the general acid mechanism is the tocopherol
cyclase. There are many known examples of cyclases, including tocopherol cyclase,
chalcone isomerase, lycopene cyclase, and aristolochene synthase. On further
examination of the microscopic catalytic mechanism of phytylplastoquinol cyclization,
chalcone isomerase, as an example, has a catalytic mechanism most similar to that of
tocopherol cyclase (Figure 34).

Multiple sequence alignment was performed between slr1737, the slr1737
Arabidopsis homologue, and the Arabidopsis chalcone isomerase (Genbank: P41088)
(Figure 35). 65% of the conserved residues among the three enzymes are strictly
conserved within the known chalcone isomerases. The crystal structure of alfalfa
chalcone isomerase has been solved (Jez, Joseph M., Bowman, Marianne E., Dixon,
Richard A., and Noel, Joseph P. (2000) "Structure and mechanism of the
evolutionarily unique plant enzyme chalcone isomerase". Nature Structural Biology
7:786-791). It has been demonstrated that tyrosine (Y) 106 of the alfalfa chalcone
isomerase serves as the general acid during the cyclization reaction (Genbank: P28012).
The equivalent residue in slr1737 and the slr1737 Arabidopsis homologue is lysine (K),
which is an excellent catalytic residue as a general acid.

The information available from partial purification of tocopherol cyclase from
Chlorella protothecoides (U.S. Patent No. 5,432,069), described as being glycine
rich, water soluble and with a predicted MW of 48-50 kDa, is consistent with the
protein informatics information obtained for slr1737 and the Arabidopsis slr1737
homologue.

All publications and patent applications mentioned in this specification are 
indicative of the level of skill of those skilled in the art to which this invention 
pertains. All publications and patent applications are herein incorporated by reference 
to the same extent as if each individual publication or patent application was 
specifically and individually indicated to be incorporated by reference. 
Although the foregoing invention has been described in some detail by way of
illustration and example for purposes of clarity of understanding, it will be obvious
that certain changes and modifications may be practiced within the scope of the
appended claim.
CLAIMS

What is claimed is: 

1. An isolated nucleic acid sequence encoding a tocopherol cyclase.

2. An isolated nucleic acid sequence according to Claim 1, wherein said tocopherol
cyclase is active in the cyclization of 2,3-dimethyl-5-phytylplastoquinol to tocopherol.

3. An isolated nucleic acid sequence according to Claim 1, wherein said tocopherol
cyclase is active in the cyclization of 2,3-dimethyl-5-geranylgeranylplastoquinol to
tocotrienol.

4. An isolated DNA sequence according to Claim 1, wherein said nucleic acid sequence is
isolated from a eukaryotic cell source.

5. An isolated DNA sequence according to Claim 4, wherein said eukaryotic cell source is
selected from the group consisting of mammalian, nematode, fungal, and plant cells.

6. The DNA encoding sequence of Claim 5 wherein said tocopherol cyclase protein is from
Arabidopsis.

7. The DNA encoding sequence of Claim 6 wherein said tocopherol cyclase protein is
encoded by a sequence of SEQ ID NO:109.

8. The DNA encoding sequence of Claim 7 wherein said tocopherol cyclase protein has an
amino acid sequence of SEQ ID NO:110.

9. The DNA encoding sequence of Claim 4 wherein said tocopherol cyclase protein is from a
source selected from the group consisting of Arabidopsis, soybean, corn, rice, wheat, leek,
canola, cotton, and tomato.

10. An isolated DNA sequence according to Claim 4, wherein said prokaryotic source is a
Synechocystis sp.

11. The DNA encoding sequence of Claim 10 wherein said tocopherol cyclase protein is
encoded by a sequence of SEQ ID NO:38.

12. The DNA encoding sequence of Claim 10 wherein said tocopherol cyclase protein has an
amino acid sequence of SEQ ID NO:39.
13. A nucleic acid construct comprising as operably linked components, a transcriptional
initiation region functional in a host cell, a nucleic acid sequence encoding a tocopherol
cyclase, and a transcriptional termination region.

14. A nucleic acid construct according to Claim 13, wherein said nucleic acid sequence
encoding tocopherol cyclase is obtained from an organism selected from the group consisting
of a eukaryotic organism and a prokaryotic organism.

15. A nucleic acid construct according to Claim 14, wherein said nucleic acid sequence
encoding tocopherol cyclase is obtained from a plant source.

16. A nucleic acid construct according to Claim 15, wherein said nucleic acid sequence
encoding tocopherol cyclase is obtained from a source selected from the group consisting of
Arabidopsis, soybean, corn, rice, wheat, leek, canola, cotton, and tomato.

17. A nucleic acid construct according to Claim 13, wherein said nucleic acid sequence
encoding tocopherol cyclase is obtained from a Synechocystis sp.

18. A plant cell comprising the construct of Claim 13.

19. A plant comprising a cell of Claim 18.

20. A feed composition produced from a plant according to Claim 19.

21. A seed comprising a cell of Claim 18.

22. Oil obtained from a seed of Claim 21.

23. A natural tocopherol rich refined and deodorised oil which has been produced by
a method of treating an oil according to Claim 22 by distilling under low pressure and
high temperature, wherein said refined oil has reduced free fatty acids and a
substantial percentage of tocopherol present in the pretreated oil.

24. A refined oil according to Claim 23, wherein the pretreated oil is crude or
pre-treated soybean oil.

25. A refined oil according to Claim 23, wherein the refined oil is degummed and
bleached.

26. A method for the alteration of the isoprenoid content in a host cell, said method
comprising: transforming said host cell with a construct comprising as operably linked
components, a transcriptional initiation region functional in a host cell, a nucleic acid
sequence encoding tocopherol cyclase, and a transcriptional termination region,
wherein said isoprenoid compound is selected from the group of tocopherols and tocotrienols.

27. The method according to Claim 26, wherein said host cell is selected from the group
consisting of a prokaryotic cell and a eukaryotic cell.

28. The method according to Claim 27, wherein said prokaryotic cell is a Synechocystis sp.

29. The method according to Claim 27, wherein said eukaryotic cell is a plant cell.

30. The method according to Claim 29, wherein said plant cell is obtained from a plant
selected from the group consisting of Arabidopsis, soybean, corn, rice, wheat, leek, canola,
cotton, and tomato.

31. A method for producing an isoprenoid compound of interest in a host cell, said method
comprising obtaining a transformed host cell, said host cell having and expressing in its
genome:
a construct having a DNA sequence encoding a tocopherol cyclase operably linked to a
transcriptional initiation region functional in a host cell,
wherein said isoprenoid compound is selected from the group of tocopherols and tocotrienols.

32. The method according to Claim 31, wherein said host cell is selected from the group
consisting of a prokaryotic cell and a eukaryotic cell.

33. The method according to Claim 32, wherein said prokaryotic cell is a Synechocystis sp.

34. The method according to Claim 32, wherein said eukaryotic cell is a plant cell.

35. The method according to Claim 34, wherein said plant cell is obtained from a plant
selected from the group consisting of Arabidopsis, soybean, corn, rice, wheat, leek, canola,
cotton, and tomato.

36. A method for increasing the biosynthetic flux in a host cell toward production of
an isoprenoid compound, said method comprising: transforming said host cell with a
construct comprising as operably linked components, a transcriptional initiation
region functional in a host cell, a DNA encoding a tocopherol cyclase, and a
transcriptional termination region, wherein said isoprenoid compound is selected from
the group of tocopherols and tocotrienols.



WO 01/79472 



PCT/US01/12334 




37. The method according to Claim 36, wherein said host cell is selected from the group
consisting of a prokaryotic cell and a eukaryotic cell.

38. The method according to Claim 37, wherein said prokaryotic cell is a Synechocystis sp.

39. The method according to Claim 37, wherein said eukaryotic cell is a plant cell.

40. The method according to Claim 39, wherein said plant cell is obtained from a plant
selected from the group consisting of Arabidopsis, soybean, corn, rice, wheat, leek, canola,
cotton, and tomato.

41. The method according to Claim 39, wherein said transcriptional initiation region is a
seed-specific promoter.
Figure 25
Query Sequence: F4D11 AL022537

Database: PIR_T04448.atcea.list.fasta

Database: PIR_T04448
Plus (+) denotes forward strand, and minus (-) reverse strand.
Asterisks (*) denote bases not shown on pairwise alignments.

Alignment 1



[Alignment panel not reproducible as text. It aligns the genomic query (lines starting at positions 12194 through 11655) with the EST ATCEA4C371+ (positions 1 through 299) and, from query position 11655, with the PIR:T04448 translation. Labels in the panel read "Met", "est", "first", "here", "Synecho seq aligns from", and "60 bp removed"; the PIR:T04448 translation beginning M Q F N I is labelled "arab sequence which is incorrect".]

Figure 31A
[Alignment panel, continued: query lines starting at positions 11595 through 11055, ATCEA4C371+ positions 299 through 537, and PIR:T04448 residues 6 through 132. Exon annotations in the panel: Exon 11538 11301, Confidence: 100 100; Exon 11609 11294, Confidence: 100 100; Exon 11083 11004, Confidence: 96 100.]

Figure 31B
Query-       10995 TCCTCCCTTGTTGGTTACTTTGTTATCTGTTAAATAGTTTTCCAATTGTATCCGGATA^T
PIR:T04448     133

Query-       10935 GTTCTACTTCTCCTTGTAGAAAATCTCAAGTTTTTGT^^
PIR:T04448     133

Query-       10875 TTGATTTGTAAAGCATGTCGTTTTATTGTAGGAAT^
PIR:T04448     133 E F M R R V S Z Q T

Query-       10815 CX^GCTACTCCATTTTGGCATCAAGGTCACATTTGCGATGATGGCCGGTAATTATATGA
PIR:T04448     143 QATPFWHQGHICDDGR
Exon 10844 10768 Confidence: 100 100

Query-       10755 TTCTATGCACAACAAGAATTCACTATATTATAAATATTGGATATTGAGTATTTTTGTTGA
PIR:T04448     159

Query-       10695 AAATTTCTGTGTTTAAATCTGACTTGACTTGTTTTGTCAGTACTGACTATGCGGAAACTG
PIR:T04448     159 T D Y A E T V

Query-       10635 TGAAATCTGCTCGTTGGGAGTATAGTACTCGTCCCGTTTACGGTTGGGGTGATGTTGGGG
PIR:T04448     166 KSARWEYSTRPV Y *G* W G D V G* A

Query-       10575 CCAAACAGAAGTCAACTGCAGGCTGGCCTGCAGCTTTTCCTGTATTTGAGCCTCATTGGC
PIR:T04448     186 KQKSTAGWPAAFPVFEPHWQ

Query-       10515 AGATATGCATGGCAGGAGGCCTTTCCACAGGTGTGAGCTTTGCTTGATTGACTTAAAGTT
PIR:T04448     206 ICMAGGLSTG
Exon 10655 10486 Confidence: 96 100

Query-       10455 AATAAATAGACGGTTAAGTTTACTTGCCTAGTACTAACAGAAAATTAAGAAAGAAACCAC
PIR:T04448     216

Query-       10395 CCTCTTTCTATCAGCAGAAACTGCTATTGTAGTTCTTATTTTTTCTCTTGTATTTGCAGG
PIR:T04448

Query-       10335 GTGGATAGAATGGGGCGGTGAAAGGTTTGAGTTTCGGGATGCACCTTCTTATTCAGAGAA
PIR:T04448     216 WIEWGGERFEFRDAPSYSEK

Query-       10275 GAATTGGGGTGGAGGCTTCCCAAGAAAATGGTTTTGGGTAAAAC^
PIR:T04448     236 NWGGGFPRKWFW
Exon 10336 10239 Confidence: 96 100



Figure 31 C 
Query-       10215 ACATTTCTTGTTGCAGACTTTAGTTAGCTAGTGGACCTGTGTATRCACCCACATGTAGTA
PIR:T04448     248

Query-       10155 TACTTGTTTGATAGCTTTATTTGTCAATGTCTCTTTACAGGTCCAGTGTAATGTCTTTGA
PIR:T04448     248 V Q C N V F E

Query-       10095 AGGGGCAACTGGAGAAGTTGCTTTAACCGCAGGTGGCGGGTTGAGGCAATTGCCTGGATT
PIR:T04448     255 GATGEVALTAGGGLRQLPGL

Query-       10035 GACTGAGACCTATGAAAATGCTGCACTGGTATGCACTTATAAGATCTTCTTAAGCAATGA
PIR:T04448     275 TETYENAAL
Exon 10115 10008 Confidence: 100 100

Query-        9975 CAGTGAGTATTAGAAGGCAGATAGTTTACAAAAGCTCTGGGCCCTTGTAAATCTGCAGGT
PIR:T04448     284

Query-        9915 TTGTGTACACTATGATGGAAAAATGTACGAGTTTGTTCCTTGGAATGGTGTTGTTAGATG
PIR:T04448     285 CVHYDGKMYEFVPWNGVVRW
GSDB:S:495-    532 tagatg

Query-        9855 GGAAATGTCTCCCTGGGG TTATTGGTATATAACTGCAGAGAACGAAAACCATGTGGTAA
PIR:T04448     305 EMSPWG YWYITAENENHV
GSDB:S:495-    526 ggaaat tctccctgggggttattggtatataactgcagagaNcgNaaaccatgtg
Exon 9917 9801 Confidence: 100 100
Exon 9861 9801 Confidence: 93 93

Query-        9796 ATTTGTTTTACTAGTTTCATTCAGTTTTACTTTTGACATCATATCATTCCCTTATGGCTA
PIR:T04448     323
GSDB:S:495-    471

Query-        9736 GATTCCAACACCCGATGAATGTCTTGTGACAGGTGGAACTAGAGGCAAGAACAAATGAAG
PIR:T04448     323 VELEARTKEA
GSDB:S:495-    471 gtggaactagaggcNagaacaaatgaag

Query-        9676 CGGGTACACCTCTGCGTGCTCCTACCACAGAAGTTGGGCTAGCTACGGCTTGCAGAGATA
PIR:T04448     333 GTPLRAPTTEVGLATACRDS
GSDB:S:495-    443 cgggtacacctctgcgtgctcctaccacagaagttgggctagctacggcttgcagagata

Query-        9616 GTTGTTACGGTGAATTGAAGTTGCAGATATGGGAACGGCTATATGATGGAAGTAAAGGCA
PIR:T04448     353 CYGELKLQIWERLYDGSKGK
GSDB:S:495-    383 gttgttacggtgaattgaagttgcagatatgggaacggctatatgatggaagtaaaggca



Figure 31 D 
Query-        9556 AGGTATGTATGCTAATGTGA1XXAATCCCTGTAGTTAAAAGTCTTAACAAATC
PIR:T04448     373 LKVLTNPKA
GSDB:S:495-    323 ag
Exon 9704 9555 Confidence: 100 100
Exon 9704 9555 Confidence: 98 100

Query-        9496 AGTGAAAGAAGATTATGAACGTTTGTTATGGTTAACAATGATGCAGGTGATATTAGAGAC
PIR:T04448     382 VKEDYERLLWLTMMQ V I L E T
GSDB:S:495-    321 gtgatattagagac

Query-        9436 AAAGAGCTCAATGGCAGCAGTGGAGATAGGAGGAGGACCGTGGTTTGGGACATGGAAAGG
PIR:T04448     402 KSSMAAVEIGGGPWFGTWKG
GSDB:S:495-    307 aaagagctcaatggcaNcagtggagataggaggaggaccgtggtttgggacatggaaagg

Query-        9376 AGATACGAGCAA(^CGCCCGAGCfACTAAAACAGGCTCTTCAGGTCCCATTGGATCTTGA
PIR:T04448     422 DTSNTPELLKQALQVPLDLE
GSDB:S:495-    247 agatacgagcaacacgcccgagctactaaaacaggctcttcaggtcccattggatcttga

(stop)
Query-        9316 AAGCGCCTTAGGTTTGGTCCCTTTCTTCAAGCCACCGGGTCTG TAM
PIR:T04448     442 SALGLVPFFKPPGL
GSDB:S:495-    187 aagcgccttaggtttggtccctttcttcaagccaccgggtctgtaacattgatgagtgtt
Exon 9522 9274 Confidence: 100 100

Query-        9256
PIR:T04448     456
GSDB:S:495-    127 ttgtttgttgatagagacccatgtgatgaatgaagccttagtcatgtcattgctagcttc

Query-        9196 ACTATTATGTATGTATGATTTTAGTTCGTTCGGTCCTTGTGGTAAATGATACGGGCCAGT
GSDB:S:495-     67 actattatgtatgtatgattttagttcgttcggtccttgtggtaaatgatacgggccagt

Query-        9136 GTAAAGTCTAGTTCAATAAAAGCCTTGAGTCGCATAATTTCAATTTCAAATTGCATC
GSDB:S:495-      7 gtaaagt
Exon 9450 9130 Confidence: 98 100



ATCEA4C37145_1 3063693|emb|CAA18584.1| 4.0e-43 (AL022537) putative protein
[Arabidopsis thaliana]

PIR:T04448 hypothetical protein F4D11.30 - Arabidopsis thaliana;
g3063693|emb|CAA18584.1 (AL022537) putative protein [Arabidopsis thaliana] F4D11.30

GSDB:S:4955486|AI995392|AI995392|701673779 A. thaliana, Columbia Col-0, inflorescence-1
Arabidopsis thaliana cDNA clone 701673779, mRNA sequence.



Figure 31 E 
Figure 32

[Plot: vertical axis scale from 1.5 to -2.5; horizontal axis: Position]



Figure 33 
Phytylplastoquinol
Tocopherol Cyclase
R = H or CH3
Tocopherol

Chalcone Isomerase

Lycopene Cyclase
Carotene

Farnesyl diphosphate
PPi
Aristolochene Synthase
slr1737_SYNSP_S74814
slr1737_ARATH_T04448  MEIRSLIVSMNPNLSSFELSRPVSPLTRSLVPFRSTKLVPRSISRVSASI
CFI_ARATH_P41088_

slr1737_SYNSP_S74814  KFP PHSGYHWQGQS-PFFEGWYVRLL
slr1737_ARATH_T04448  STPNSETDKISVKPVYVPTSPNRELRTPHSGYHFDGTPRKFFEGWYFRVS
CFI_ARATH_P41088_

slr1737_SYNSP_S74814  LPQSGESFAFMYSIENPASDHHYGGGAVQILGPATK KQENQEDQLV
slr1737_ARATH_T04448  IPEKRESFCFMYSVENPAFRQSLSPLEVALYGPRFTGVGAQILGANDKYL
CFI_ARATH_P41088_     MSSSNACASPSPFPA VTKLHVDSV-

slr1737_SYNSP_S74814  WRTFPSVKKFWASPRQFALG-HWGKCRDNRQ-AKPLLSEEFFATVKEGYQ
slr1737_ARATH_T04448  CQYEQDSHNFWGDRHELVLGNTFSAVPGAKAPNKEVPPEEFNRRVSEGFQ
CFI_ARATH_P41088_     --TFVPSVKSPASSNPLFLG-GAGVRGLDIQ-GK FVIFTVIGVY

slr1737_SYNSP_S74814  IHQNQHQGQIIHGDR HCRWQFTVEPEVTWGS PNRFPRATAGW
slr1737_ARATH_T04448  ATPFWHQGHICDDGRTDYAETVKSARWEYSTRPVYGWGDVGAKQKSTAGW
CFI_ARATH_P41088_     LEGNAVPSLSV KWKGKTTEELTES I PFFREI VTGAF

slr1737_SYNSP_S74814  LSFLPLFDPGWQILLAQGRAHGWLKWQREQYEFDHALVYAEKNWGHSFPS
slr1737_ARATH_T04448  PAAFPVFEPHWQICMAGGLSTGWIEWGGERFEFRDAPSYSEKNWGGGFPR
CFI_ARATH_P41088_     EKFIKVT M ; KLPLTGQQYSEKVTENC

slr1737_SYNSP_S74814  RWFWLQANYFPDHPG-LSVTAAGGERIVLGRPE EVALIGLHHQGNFY
slr1737_ARATH_T04448  KWFWVQCNVFEGATGEVALTAGGGLRQLPGLTETYENAALVCVHYDGKMY
CFI_ARATH_P41088_     VAIWKQLGLYTDCEA-KAV EKFLEIFKE ET

slr1737_SYNSP_S74814  EFGPGHGTVTWQVAPWGRWQLKASNDRYWVKLSGKTDKKGSLVHTP-TAQ
slr1737_ARATH_T04448  EFVPWNGVVRWEMSPWGYWYITAENENHVVELEARTNEAGTPLRAPTTEV
CFI_ARATH_P41088_     -FPPG-SSILFALSPTGSLTVAFSKDDS-IPETGIAVIENKLLAEA-VLE

slr1737_SYNSP_S74814  GLQLNCRDTTRGYLYLQLGSVGHG LIVQGETDTAGLEVGG
slr1737_ARATH_T04448  GLATACRDSCYGELKLQIWERLYDGSKGKVILETKSSMAAVEIGGGPWFG
CFI_ARATH_P41088_     -- S I IGKNGVSPGTRLSVAERLSQ LMMKNKDEKEVSDHSL

slr1737_SYNSP_S74814  DWGLTEENLSKKT VPF
slr1737_ARATH_T04448  TWKGDTSNTPELLKQALQVPLDLESALGLVPFFKPPGL
CFI_ARATH_P41088_     EEKLAKEN



Figure 35 
SEQUENCE LISTING

<110> Subramaniam, Sai
      Slater, Steven
      Karberg, Katherine
      Chen, Ridong
      Valentin, Henry
      Huang Wong, Yun-Hua

<120> Nucleic Acid Sequences Involved in
      Tocopherol Synthesis

<130> MOCO.008.00WO

<150> 09/549,848
<151> 2000-04-15

<150> 09/688,069
<151> 2000-10-15

<160> 94

<170> FastSEQ for Windows Version 4.0

<210> 1
<211> 1182
<212> DNA
<213> Arabidopsis sp

<400> 1

atggagtctc tgctctctag ttcttctctt gtttccgctg ctggtgggtt ttgttggaag 60
aagcagaatc taaagctcca ctctttatca gaaatccgag ttctgcgttg tgattcgagt 120
aaagttgtcg caaaaccgaa gtttaggaac aatcttgtta ggcctgatgg tcaaggatct 180
tcattgttgt tgtatccaaa acataagtcg agatttcggg ttaatgccac tgcgggtcag 240
cctgaggctt tcgactcgaa tagcaaacag aagtctttta gagactcgtt agatgcgttt 300
tacaggtttt ctaggcctca tacagttatt ggcacagtgc ttagcatttt atctgtatct 360
ttcttagcag tagagaaggt ttctgatata tctcctttac ttttcactgg catcttggag 420



WO 01/79472 



PCT/US01/12334 



gctgttgttg cagctctcat gatgaacatt tacatagttg ggctaaatca gttgtctgat 480
gttgaaatag ataaggttaa caagccctat cttccattgg catcaggaga atattctgtt 540
aacaccggca ttgcaatagt agcttccttc tccatcatga gtttctggct tgggtggatt 600
gttggttcat ggccattgtt ctgggctctt tttgtgagtt tcatgctcgg tactgcatac 660
tctatcaatt tgccactttt acggtggaaa agatttgcat tggttgcagc aatgtgtatc 720
ctcgctgtcc gagctattat tgttcaaatc gccttttatc tacatattca gacacatgtg 780
tttggaagac caatcttgtt cactaggcct cttattttcg ccactgcgtt tatgagcttt 840
ttctctgtcg ttattgcatt gtttaaggat atacctgata tcgaagggga taagatattc 900
ggaatccgat cattctctgt aactctgggt cagaaacggg tgttttggac atgtgttaca 960
ctacttcaaa tggcttacgc tgttgcaatt ctagttggag ccacatctcc attcatatgg 1020
agcaaagtca tctcggttgt gggtcatgtt atactcgcaa caactttgtg ggctcgagct 1080
aagtccgttg atctgagtag caaaaccgaa ataacttcat gttatatgtt catatggaag 1140
ctcttttatg cagagtactt gctgttacct tttttgaagt ga 1182



<210> 2
<211> 393
<212> PRT
<213> Arabidopsis sp

<400> 2
Met Glu Ser Leu Leu Ser Ser Ser Ser Leu Val Ser Ala Ala Gly Gly
1 5 10 15

Phe Cys Trp Lys Lys Gln Asn Leu Lys Leu His Ser Leu Ser Glu Ile
20 25 30

Arg Val Leu Arg Cys Asp Ser Ser Lys Val Val Ala Lys Pro Lys Phe
35 40 45

Arg Asn Asn Leu Val Arg Pro Asp Gly Gln Gly Ser Ser Leu Leu Leu
50 55 60

Tyr Pro Lys His Lys Ser Arg Phe Arg Val Asn Ala Thr Ala Gly Gln
65 70 75 80

Pro Glu Ala Phe Asp Ser Asn Ser Lys Gln Lys Ser Phe Arg Asp Ser
85 90 95

Leu Asp Ala Phe Tyr Arg Phe Ser Arg Pro His Thr Val Ile Gly Thr
100 105 110

Val Leu Ser Ile Leu Ser Val Ser Phe Leu Ala Val Glu Lys Val Ser
115 120 125

Asp Ile Ser Pro Leu Leu Phe Thr Gly Ile Leu Glu Ala Val Val Ala
130 135 140

Ala Leu Met Met Asn Ile Tyr Ile Val Gly Leu Asn Gln Leu Ser Asp
145 150 155 160

Val Glu Ile Asp Lys Val Asn Lys Pro Tyr Leu Pro Leu Ala Ser Gly
165 170 175

Glu Tyr Ser Val Asn Thr Gly Ile Ala Ile Val Ala Ser Phe Ser Ile
180 185 190

Met Ser Phe Trp Leu Gly Trp Ile Val Gly Ser Trp Pro Leu Phe Trp
195 200 205

Ala Leu Phe Val Ser Phe Met Leu Gly Thr Ala Tyr Ser Ile Asn Leu
210 215 220

Pro Leu Leu Arg Trp Lys Arg Phe Ala Leu Val Ala Ala Met Cys Ile
225 230 235 240

Leu Ala Val Arg Ala Ile Ile Val Gln Ile Ala Phe Tyr Leu His Ile
245 250 255

Gln Thr His Val Phe Gly Arg Pro Ile Leu Phe Thr Arg Pro Leu Ile
260 265 270

Phe Ala Thr Ala Phe Met Ser Phe Phe Ser Val Val Ile Ala Leu Phe
275 280 285

Lys Asp Ile Pro Asp Ile Glu Gly Asp Lys Ile Phe Gly Ile Arg Ser
290 295 300

Phe Ser Val Thr Leu Gly Gln Lys Arg Val Phe Trp Thr Cys Val Thr
305 310 315 320

Leu Leu Gln Met Ala Tyr Ala Val Ala Ile Leu Val Gly Ala Thr Ser
325 330 335

Pro Phe Ile Trp Ser Lys Val Ile Ser Val Val Gly His Val Ile Leu
340 345 350

Ala Thr Thr Leu Trp Ala Arg Ala Lys Ser Val Asp Leu Ser Ser Lys
355 360 365

Thr Glu Ile Thr Ser Cys Tyr Met Phe Ile Trp Lys Leu Phe Tyr Ala
370 375 380

Glu Tyr Leu Leu Leu Pro Phe Leu Lys
385 390



<210> 3
<211> 1224
<212> DNA
<213> Arabidopsis sp

<400> 3
atggcgtttt ttgggctctc ccgtgtttca agacggttgt tgaaatcttc cgtctccgta 60
actccatctt cttcctctgc tcttttgcaa tcacaacata aatccttgtc caatcctgtg 120
actacccatt acacaaatcc tttcactaag tgttatcctt catggaatga taattaccaa 180
gtatggagta aaggaagaga attgcatcag gagaagtttt ttggtgttgg ttggaattac 240
agattaattt gtggaatgtc gtcgtcttct tcggttttgg agggaaagcc gaagaaagat 300
gataaggaga agagtgatgg tgttgttgtt aagaaagctt cttggataga tttgtattta 360
ccagaagaag ttagaggtta tgctaagctt gctcgattgg ataaacccat tggaacttgg 420
ttgcttgcgt ggccttgtat gtggtcgatt gcgttggctg ctgatcctgg aagccttcca 480
agttttaaat atatggcttt atttggttgc ggagcattac ttcttagagg tgctggttgt 540
actataaatg atctgcttga tcaggacata gatacaaagg ttgatcgtac aaaactaaga 600
cctatcgcca gtggtctttt gacaccattt caagggattg gatttctcgg gctgcagttg 660
cttttaggct tagggattct tctccaactt aacaattaca gccgtgtttt aggggcttca 720
tctttgttac ttgtcttttc ctacccactt atgaagaggt ttacattttg gcctcaagcc 780
tttttaggtt tgaccataaa ctggggagca ttgttaggat ggactgcagt taaaggaagc 840
atagcaccat cfcattgtact ccctctctat ctctccggag tctgctggac ccttgtttat 900
gatactattt atgcacatca ggacaaagaa gatgatgtaa aagttggtgt taagtcaaca 960
gcccttagat tcggtgataa tacaaagctt tggttaactg gatttggcac agcatccata 1020
ggttttcttg cactttctgg attcagtgca gatctcgggt ggcaatatta cgcatcactg 1080
gccgctgcat caggacagtt aggatggcaa atagggacag ctgacttatc atctggtgct 1140
gactgcagta gaaaatttgt gtcgaacaag tggtttggtg ctattatatt tagtggagtt 1200
gtacttggaa gaagttttca ataa 1224

<210> 4
<211> 407
<212> PRT
<213> Arabidopsis sp

<400> 4
Met Ala Phe Phe Gly Leu Ser Arg Val Ser Arg Arg Leu Leu Lys Ser
1 5 10 15

Ser Val Ser Val Thr Pro Ser Ser Ser Ser Ala Leu Leu Gln Ser Gln
20 25 30

His Lys Ser Leu Ser Asn Pro Val Thr Thr His Tyr Thr Asn Pro Phe
35 40 45

Thr Lys Cys Tyr Pro Ser Trp Asn Asp Asn Tyr Gln Val Trp Ser Lys
50 55 60

Gly Arg Glu Leu His Gln Glu Lys Phe Phe Gly Val Gly Trp Asn Tyr
65 70 75 80

Arg Leu Ile Cys Gly Met Ser Ser Ser Ser Ser Val Leu Glu Gly Lys
85 90 95

Pro Lys Lys Asp Asp Lys Glu Lys Ser Asp Gly Val Val Val Lys Lys
100 105 110

Ala Ser Trp Ile Asp Leu Tyr Leu Pro Glu Glu Val Arg Gly Tyr Ala
115 120 125

Lys Leu Ala Arg Leu Asp Lys Pro Ile Gly Thr Trp Leu Leu Ala Trp
130 135 140

Pro Cys Met Trp Ser Ile Ala Leu Ala Ala Asp Pro Gly Ser Leu Pro
145 150 155 160

Ser Phe Lys Tyr Met Ala Leu Phe Gly Cys Gly Ala Leu Leu Leu Arg
165 170 175

Gly Ala Gly Cys Thr Ile Asn Asp Leu Leu Asp Gln Asp Ile Asp Thr
180 185 190

Lys Val Asp Arg Thr Lys Leu Arg Pro Ile Ala Ser Gly Leu Leu Thr
195 200 205

Pro Phe Gln Gly Ile Gly Phe Leu Gly Leu Gln Leu Leu Leu Gly Leu
210 215 220

Gly Ile Leu Leu Gln Leu Asn Asn Tyr Ser Arg Val Leu Gly Ala Ser
225 230 235 240

Ser Leu Leu Leu Val Phe Ser Tyr Pro Leu Met Lys Arg Phe Thr Phe
245 250 255

Trp Pro Gln Ala Phe Leu Gly Leu Thr Ile Asn Trp Gly Ala Leu Leu
260 265 270

Gly Trp Thr Ala Val Lys Gly Ser Ile Ala Pro Ser Ile Val Leu Pro
275 280 285

Leu Tyr Leu Ser Gly Val Cys Trp Thr Leu Val Tyr Asp Thr Ile Tyr
290 295 300

Ala His Gln Asp Lys Glu Asp Asp Val Lys Val Gly Val Lys Ser Thr
305 310 315 320

Ala Leu Arg Phe Gly Asp Asn Thr Lys Leu Trp Leu Thr Gly Phe Gly
325 330 335

Thr Ala Ser Ile Gly Phe Leu Ala Leu Ser Gly Phe Ser Ala Asp Leu
340 345 350

Gly Trp Gln Tyr Tyr Ala Ser Leu Ala Ala Ala Ser Gly Gln Leu Gly
355 360 365

Trp Gln Ile Gly Thr Ala Asp Leu Ser Ser Gly Ala Asp Cys Ser Arg
370 375 380

Lys Phe Val Ser Asn Lys Trp Phe Gly Ala Ile Ile Phe Ser Gly Val
385 390 395 400

Val Leu Gly Arg Ser Phe Gln
405

<210> 5
<211> 1296
<212> DNA
<213> Arabidopsis sp

<400> 5
atgtggcgaa gatctgttgt ttctcgttta tcttcaagaa tctctgtttc ttcttcgtta 60
ccaaacccta gactgattcc ttggtcccgc gaattatgtg ccgttaatag cttctcccag 120
cctccggtct cgacggaatc aactgctaag ttagggatca ctggtgttag atctgatgcc 180
aatcgagttt ttgccactgc tactgccgcc gctacagcta cagctaccac cggtgagatt 240
tcgtctagag ttgcggcttt ggctggatta gggcatcact acgctcgttg ttattgggag 300
ctttctaaag ctaaacttag tatgcttgtg gttgcaactt ctggaactgg gtatattctg 360
ggtacgggaa atgctgcaat tagcttcccg gggctttgtt acacatgtgc aggaaccatg 420
atgattgctg catctgctaa ttccttgaat cagatttttg agataagcaa tgattctaag 480
atgaaaagaa cgatgctaag gccattgcct tcaggacgta ttagtgttcc acacgctgtt 540
gcatgggcta ctattgctgg tgcttctggt gcttgtttgt tggccagcaa gactaatatg 600
ttggctgctg gacttgcatc tgccaatctt gtactttatg cgtttgttta tactccgttg 660
aagcaacttc accctatcaa tacatgggtt ggcgctgttg ttggtgctat cccacccttg 720
cttgggtggg cggcagcgtc tggtcagatt tcatacaatt cgatgattct tccagctgct 780
ctttactttt ggcagatacc tcattttatg gcccttgcac atctctgccg caatgattat 840
gcagctggag gttacaagat gttgtcactc tttgatccgt cagggaagag aatagcagca 900
gtggctctaa ggaactgctt ttacatgatc cctctcggtt tcatcgccta tgactggggg 960
ttaacctcaa gttggttttg cctcgaatca acacttctca cactagcaat cgctgcaaca 1020
gcattttcat tctaccgaga ccggaccatg cataaagcaa ggaaaatgtt ccatgccagt 1080
cttctcttcc ttcctgtttt catgtctggt cttcttctac accgtgtctfc taatgataat 1140
cagcaacaac tcgtagaaga agccggatta acaaattctg tatctggtga agtcaaaact 1200
cagaggcgaa agaaacgtgt ggctcaacct ccggtggctt atgcctctgc tgcaccgttt 1260
cctttcctcc cagctccttc cttctactct ccatga 1296



<210> 6
<211> 431
<212> PRT
<213> Arabidopsis sp

<400> 6
Met Trp Arg Arg Ser Val Val Tyr Arg Phe Ser Ser Arg Ile Ser Val
1 5 10 15

Ser Ser Ser Leu Pro Asn Pro Arg Leu Ile Pro Trp Ser Arg Glu Leu
20 25 30

Cys Ala Val Asn Ser Phe Ser Gln Pro Pro Val Ser Thr Glu Ser Thr
35 40 45

Ala Lys Leu Gly Ile Thr Gly Val Arg Ser Asp Ala Asn Arg Val Phe
50 55 60

Ala Thr Ala Thr Ala Ala Ala Thr Ala Thr Ala Thr Thr Gly Glu Ile
65 70 75 80

Ser Ser Arg Val Ala Ala Leu Ala Gly Leu Gly His His Tyr Ala Arg
85 90 95

Cys Tyr Trp Glu Leu Ser Lys Ala Lys Leu Ser Met Leu Val Val Ala
100 105 110

Thr Ser Gly Thr Gly Tyr Ile Leu Gly Thr Gly Asn Ala Ala Ile Ser
115 120 125

Phe Pro Gly Leu Cys Tyr Thr Cys Ala Gly Thr Met Met Ile Ala Ala
130 135 140

Ser Ala Asn Ser Leu Asn Gln Ile Phe Glu Ile Ser Asn Asp Ser Lys
145 150 155 160

Met Lys Arg Thr Met Leu Arg Pro Leu Pro Ser Gly Arg Ile Ser Val
165 170 175

Pro His Ala Val Ala Trp Ala Thr Ile Ala Gly Ala Ser Gly Ala Cys
180 185 190

Leu Leu Ala Ser Lys Thr Asn Met Leu Ala Ala Gly Leu Ala Ser Ala
195 200 205

Asn Leu Val Leu Tyr Ala Phe Val Tyr Thr Pro Leu Lys Gln Leu His
210 215 220

Pro Ile Asn Thr Trp Val Gly Ala Val Val Gly Ala Ile Pro Pro Leu
225 230 235 240

Leu Gly Trp Ala Ala Ala Ser Gly Gln Ile Ser Tyr Asn Ser Met Ile
245 250 255

Leu Pro Ala Ala Leu Tyr Phe Trp Gln Ile Pro His Phe Met Ala Leu
260 265 270

Ala His Leu Cys Arg Asn Asp Tyr Ala Ala Gly Gly Tyr Lys Met Leu
275 280 285

Ser Leu Phe Asp Pro Ser Gly Lys Arg Ile Ala Ala Val Ala Leu Arg
290 295 300

Asn Cys Phe Tyr Met Ile Pro Leu Gly Phe Ile Ala Tyr Asp Trp Gly
305 310 315 320

Leu Thr Ser Ser Trp Phe Cys Leu Glu Ser Thr Leu Leu Thr Leu Ala
325 330 335

Ile Ala Ala Thr Ala Phe Ser Phe Tyr Arg Asp Arg Thr Met His Lys
340 345 350

Ala Arg Lys Met Phe His Ala Ser Leu Leu Phe Leu Pro Val Phe Met
355 360 365

Ser Gly Leu Leu Leu His Arg Val Ser Asn Asp Asn Gln Gln Gln Leu
370 375 380

Val Glu Glu Ala Gly Leu Thr Asn Ser Val Ser Gly Glu Val Lys Thr
385 390 395 400

Gln Arg Arg Lys Lys Arg Val Ala Gln Pro Pro Val Ala Tyr Ala Ser
405 410 415

Ala Ala Pro Phe Pro Phe Leu Pro Ala Pro Ser Phe Tyr Ser Pro
420 425 430



<210> 7
<211> 479
<212> DNA
<213> Arabidopsis sp

<400> 7
ggaaactccc ggagcacctg tttgcaggta ccgctaacct taatcgataa tttatttctc 60
ttgtcaggaa ttatgtaagt ctggtggaag gctcgcatac catttttgca ttgcctttcg 120
ctatgatcgg gtttactttg ggtgtgatga gaccaggcgt ggctttatgg tatggcgaaa 180
acccattttt atccaatgct gcattccctc ccgatgattc gttctttcat tcctatacag 240
gtatcatgct gataaaactg ttactggtac tggtttgtat ggtatcagca agaagcgcgg 300
cgatggcgtt taaccggtat ctcgacaggc attttgacgc gaagaacccg cgtactgcca 360
tccgtgaaat acctgcgggc gtcatatctg ccaacagtgc gctggtgttt acgataggct 420
gctgcgtggt attctgggtg gcctgttatt tcattaacac gatctgtttt tacctggcg 479



<210> 8
<211> 551
<212> DNA
<213> Arabidopsis sp

<220>
<221> misc_feature
<222> (1)...(551)
<223> n = A,T,C or G

<400> 8
ttgtggctta caccttaatg agcatacgcc agnccattac ggctcgttaa tcggcgccat 60
ngccggngct gntgcaccgg tagtgggcta ctgcgccgtg accaatcagc ttgatctagc 120
ggctcttatt ctgttfcttaa ttttactgtt ctggcaaatg ccgcattttt acgcgatttc 180
cattttcagg ctaaaagact tttcagcggc ctgtattccg gtgctgccca tcattaaaga 240
cctgcgctat accaaaatca gcatgctggt ttacgtgggc ttatttacac tggctgctat 300
catgccggcc ctcttagggt atgccggttg gatttatggg atagcggcct taattttagg 360
cttgtattgg ctttatattg ccatacaagg attcaagacc gccgatgatc aaaaatggtc 420
tcgtaagatg tttggatctt cgattttaat cattaccctc ttgtcggtaa tgatgcttgt 480
ttaaacttac tgcctcctga agtttatata tcgataattt cagcttaagg aggcttagtg 540
gttaattcaa t 551

<210> 9
<211> 297
<212> PRT
<213> Arabidopsis sp

<400> 9
Met Val Leu Ala Glu Val Pro Lys Leu Ala Ser Ala Ala Glu Tyr Phe
1 5 10 15

Phe Lys Arg Gly Val Gln Gly Lys Gln Phe Arg Ser Thr Ile Leu Leu
20 25 30

Leu Met Ala Thr Ala Leu Asn Val Arg Val Pro Glu Ala Leu Ile Gly
35 40 45

Glu Ser Thr Asp Ile Val Thr Ser Glu Leu Arg Val Arg Gln Arg Gly
50 55 60

Ile Ala Glu Ile Thr Glu Met Ile His Val Ala Ser Leu Leu His Asp
65 70 75 80

Asp Val Leu Asp Asp Ala Asp Thr Arg Arg Gly Val Gly Ser Leu Asn
85 90 95

Val Val Met Gly Asn Lys Val Val Ala Leu Leu Ala Thr Ala Val Glu
100 105 110

His Leu Val Thr Gly Glu Thr Met Glu Ile Thr Ser Ser Thr Glu Gln
115 120 125

Arg Tyr Ser Met Asp Tyr Tyr Met Gln Lys Thr Tyr Tyr Lys Thr Ala
130 135 140

Ser Leu Ile Ser Asn Ser Cys Lys Ala Val Ala Val Leu Thr Gly Gln
145 150 155 160

Thr Ala Glu Val Ala Val Leu Ala Phe Glu Tyr Gly Arg Asn Leu Gly
165 170 175

Leu Ala Phe Gln Leu Ile Asp Asp Ile Leu Asp Phe Thr Gly Thr Ser
180 185 190

Ala Ser Leu Gly Lys Gly Ser Leu Ser Asp Ile Arg His Gly Val Ile
195 200 205

Thr Ala Pro Ile Leu Phe Ala Met Glu Glu Phe Pro Gln Leu Arg Glu
210 215 220

Val Val Asp Gln Val Glu Lys Asp Pro Arg Asn Val Asp Ile Ala Leu
225 230 235 240

Glu Tyr Leu Gly Lys Ser Lys Gly Ile Gln Arg Ala Arg Glu Leu Ala
245 250 255

Met Glu His Ala Asn Leu Ala Ala Ala Ala Ile Gly Ser Leu Pro Glu
260 265 270

Thr Asp Asn Glu Asp Val Lys Arg Ser Arg Arg Ala Leu Ile Asp Leu
275 280 285

Thr His Arg Val Ile Thr Arg Asn Lys
290 295

<210> 10
<211> 561
<212> DNA
<213> Arabidopsis sp

<400> 10
aagcgcatcc gtcctcttct acgattgccg ccagccgcat gtatggctgc ataaccgacc 60
gcccctatcc gctcgcggcc gcggtcgaat tcattcacac cgcgacgctg ctgcatgacg 120
acgtcgtcga tgaaagcgat ttgcgccgcg gccgcgaaag cgcgcataag gttttcggca 180
atcaggcgag cgtgctcgtc ggcgatttcc ttttctcccg cgccttccag ctgatggtgg 240
aagacggctc gctcgacgcg ctgcgcattc tctcggatgc ctccgccgtg atcgcgcagg 300
gcgaagtgat gcagctcggc accgcgcgca atcttgaaac caatatgagc cagtatctcg 360
atgtgatcag cgcgaagacc gccgcgctct ttgccgccgc ctgcgaaatc ggcccggtga 420
tggcgaacgc gaaggcggaa gatgctgccg cgatgtgcga atacggcatg aatctcggta 480
tcgccttcca gatcatcgac gaccttctcg attacggcac cggcggccac gccgagcttg 540
gcaagaacac gggcgacgat t 561



<210> 11
<211> 966
<212> DNA
<213> Arabidopsis sp

<400> 11
atggtacttg ccgaggttcc aaagcttgcc tctgctgctg agtacttctt caaaaggggt 60
gtgcaaggaa aacagtttcg ttcaactatt ttgctgctga tggcgacagc tctgaatgta 120
cgcgttccag aagcattgat tggggaatca acagatatag tcacatcaga attacgcgta 180
aggcaacggg gtattgctga aatcactgaa atgatacacg tcgcaagtct actgcacgat 240
gatgtcttgg atgatgccga tacaaggcgt ggtgttggtt ccttaaatgt tgtaatgggt 300
aacaagatgt cggtattagc aggagacttc ttgctctccc gggcttgtgg ggctctcgct 360
gctttaaaga acacagaggt tgtagcatta cttgcaactg ctgtagaaca tcttgttacc 420
ggtgaaacca tggaaataac tagttcaacc gagcagcgtt atagtatgga ctactacatg 480
cagaagacat attataagac agcatcgcta atctctaaca gctgcaaagc tgttgccgtt 540
ctcactggac aaacagcaga agttgccgtg ttagcttttg agtatgggag gaatctgggt 600
ttagcattcc aattaataga cgacattctt gatttcacgg gcacatctgc ctctctcgga 660
aagggatcgt tgtcagatat tcgccatgga gtcataacag ccccaatcct ctttgccatg 720
gaagagtttc ctcaactacg cgaagttgtt gatcaagttg aaaaagatcc taggaatgtt 780
gacattgctt tagagtatct tgggaagagc aagggaatac agagggcaag agaattagcc 840
atggaacatg cgaatctagc agcagctgca atcgggtctc tacctgaaac agacaatgaa 900
gatgtcaaaa gatcgaggcg ggcacttatt gacttgaccc atagagtcat caccagaaac 960
aagtga 966



<210> 12
<211> 321
<212> PRT
<213> Arabidopsis sp

<400> 12
Met Val Leu Ala Glu Val Pro Lys Leu Ala Ser Ala Ala Glu Tyr Phe
1 5 10 15

Phe Lys Arg Gly Val Gln Gly Lys Gln Phe Arg Ser Thr Ile Leu Leu
20 25 30

Leu Met Ala Thr Ala Leu Asn Val Arg Val Pro Glu Ala Leu Ile Gly
35 40 45

Glu Ser Thr Asp Ile Val Thr Ser Glu Leu Arg Val Arg Gln Arg Gly
50 55 60

Ile Ala Glu Ile Thr Glu Met Ile His Val Ala Ser Leu Leu His Asp
65 70 75 80

Asp Val Leu Asp Asp Ala Asp Thr Arg Arg Gly Val Gly Ser Leu Asn
85 90 95

Val Val Met Gly Asn Lys Met Ser Val Leu Ala Gly Asp Phe Leu Leu
100 105 110

Ser Arg Ala Cys Gly Ala Leu Ala Ala Leu Lys Asn Thr Glu Val Val
115 120 125

Ala Leu Leu Ala Thr Ala Val Glu His Leu Val Thr Gly Glu Thr Met
130 135 140

Glu Ile Thr Ser Ser Thr Glu Gln Arg Tyr Ser Met Asp Tyr Tyr Met
145 150 155 160

Gln Lys Thr Tyr Tyr Lys Thr Ala Ser Leu Ile Ser Asn Ser Cys Lys
165 170 175

Ala Val Ala Val Leu Thr Gly Gln Thr Ala Glu Val Ala Val Leu Ala
180 185 190

Phe Glu Tyr Gly Arg Asn Leu Gly Leu Ala Phe Gln Leu Ile Asp Asp
195 200 205

Ile Leu Asp Phe Thr Gly Thr Ser Ala Ser Leu Gly Lys Gly Ser Leu
210 215 220

Ser Asp Ile Arg His Gly Val Ile Thr Ala Pro Ile Leu Phe Ala Met
225 230 235 240

Glu Glu Phe Pro Gln Leu Arg Glu Val Val Asp Gln Val Glu Lys Asp
245 250 255

Pro Arg Asn Val Asp Ile Ala Leu Glu Tyr Leu Gly Lys Ser Lys Gly
260 265 270

Ile Gln Arg Ala Arg Glu Leu Ala Met Glu His Ala Asn Leu Ala Ala
275 280 285

Ala Ala Ile Gly Ser Leu Pro Glu Thr Asp Asn Glu Asp Val Lys Arg
290 295 300

Ser Arg Arg Ala Leu Ile Asp Leu Thr His Arg Val Ile Thr Arg Asn
305 310 315 320

Lys



<210> 13 
<211> 621 
<212> DNA 

<213> Arabidopsis sp 
<400> 13 

gctttctcct ttgctaattc ttgagctttc ttgatcccac cgcgatttct aactatttca 60 
atcgcttctt caagcgatcc aggctcacaa aactcagact caatgatctc tcttagcctt 120 
ggctcattct ctagcgcgaa gatcactggc gccgttatgt tacctttggc taagtcatta 180 



12 
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gctgcaggct tacctaactg ctctgtggac 
tgaaaagata aaccgagatt cttcccgaac 
actttactga aaattgctgc tcctttggtg 
taactcttta gcatgtagtc atcaagcttg 
5 tttatctcac cgcttgcaaa atctttgatc 
caaatgttct ttgtattgag tagcttcatc 
tgagcttaat gacttcaagg ttttcgagat 
aaatccccag ctaatacagc t 

10 <210> 14 
<211> 741 
<212> DNA 

<213> Arabidopsis sp 

15 <400> 14 

ggtgagtttt gttaatagtt atgagattca 
gtttaaactc tgtgtataat tgcaggaaag 
gtagcggtgc tagctggaga tttcatgttt 
gagaatcttg aagttattaa gctcatcagt 
* 20 ctatgaggtt gagctatgaa tctcatttcg 
catgttttca ggtgatcaaa gactttgcaa 
ttgactgcga caccaagctc gacgagtact 
tagtggctgc gagcaccaaa ggagctgcca 
aacaaatgta cgagtttggg aagaatctcg 

25 tggatttcac tcagtcgaca gagcagctcg 
gtaacttaac agcacctgtg attttcgctc 
ttgagtcaaa gttctgtgag gcgggttctc 
gtggggggat taagagagca c 

tgagtgaagt ccagaatgtc atcaactact 240
tgatacattt gctctgcgac cttgctttcg 300
cttgcagcta ctaatgaagc tgtcttgtag 360
acatcacaat cgaataaact cgatgcttgc 420
acctgcaaaa agataaatca agattcagac 480
taatctcaga aaggaatatt acctgactta 540
ttgtaagtac catgatgctt gagcaacatg 600
621

tctatttttg tcataaaatt gtttggtttg 60
gaaacagttc atgagctttt cggcacaaga 120
gctcaagcgt catggtactt agcaaatctc 180
caggtactta gttactctta cattgttttt 240
ttgaataatg ctgtgcctca aacttttttt 300
gcggagagat aaagcaggcg tccagcttat 360
tactcaaaag tttctacaag acagcctctt 420
ttttcagcag agttgagcct gatgtgacag 480
gtctctcttt ccagatagtt gatgatattt 540
ggaagccagc agggagtgat ttggctaaag 600
tggagaggga gccaaggcta agagagatca 660
tggaagaagc gattgaagcg gtgacaaaag 720
741

<210> 15
<211> 1087
<212> DNA
<213> Arabidopsis sp
<400> 15
cctcttcagc caatccagag gaagaagaga caacttttta tctttcgtca agagtctccg 60
aaaacgcacg gttttatgct ctctcttctg ccctcacctc acaagacgca gggcacatga 120
ttcaaccaga gggaaaaagc aacgataaca actctgcttt tgatttcaag ctgtatatga 180
tccgcaaagc cgagtctgta aatgcggctc tcgacgtttc cgtaccgctt ctgaaacccc 240
ttacgatcca agaagcggtc aggtactctt tgctagccgg cggaaaacgt gtgaggcctc 300
tgctctgcat tgccgcttgt gagcttgtgg ggggcgacga ggctactgcc atgtcagccg 360
cttgcgcggt cgagatgatc cacacaagct ctctcattca tgacgatctt ccgtgcatgg 420
acaatgccga cctccgtaga ggcaagccca ccaatcacaa ggtatgttgt ttaattatat 480
gaaggctcag agataatgct gaactagtgt tgaaccaatt tttgctcaaa caaggtatat 540
ggagaagaca tggcggtttt ggcaggtgat gcactccttg cattggcgtt tgagcacatg 600
acggttgtgt cgagtgggtt ggtcgctccc gagaagatga ttcgcgccgt ggttgagctg 660
gccagggcca tagggactac agggctagtt gctggacaaa tgatagacct agccagcgaa 720
agactgaatc cagacaaggt tggattggag catctagagt tcatccatct ccacaaaacg 780
gcggcattgt tggaggcagc ggcagtttta ggggttataa tgggaggtgg aacagaggaa 840
gaaatcgaaa agcttagaaa gtatgctagg tgtattggac tactgtttca ggttgttgat 900
gacattctcg acgtaacaaa atctactgag gaattgggta agacagccgg aaaagacgta 960
atggccggaa agctgacgta tccaaggctg ataggtttgg agggatccag ggaagttgca 1020
gagcacctga ggagagaagc agaggaaaag cttaaagggt ttgatccaag tcaggcggcg 1080
cctctgg 1087

<210> 16
<211> 1164
<212> DNA
<213> Arabidopsis sp
<400> 16
atgacttcga ttctcaacac tgtctccacc atccactctt ccagagttac ctccgtcgat 60
cgagtcggag tcctctctct tcggaattcg gattccgttg agttcactcg ccggcgttct 120
ggtttctcga cgttgatcta cgaatcaccc gggcggagat ttgttgtgcg tgcggcggag 180
actgatactg ataaagttaa atctcagaca cctgacaagg caccagccgg tggttcaagc 240
attaaccagc ttctcggtat caaaggagca tctcaagaaa ctaataaatg gaagattcgt 300
cttcagctta caaaaccagt cacttggcct ccactggttt ggggagtcgt ctgtggtgct 360
gctgcttcag ggaactttca ttggacccca gaggatgttg ctaagtcgat tctttgcatg 420
atgatgtctg gtccttgtct tactggctat acacagacaa tcaacgactg gtatgataga 480
gatatcgacg caattaatga gccatatcgt ccaattccat ctggagcaat atcagagcca 540
gaggttatta cacaagtctg ggtgctatta ttgggaggtc ttggtattgc tggaatatta 600
gatgtgtggg cagggcatac cactcccact gtcttctatc ttgctttggg aggatcattg 660
ctatcttata tatactctgc tccacctctt aagctaaaac aaaatggatg ggttggaaat 720
tttgcacttg gagcaagcta tattagtttg ccatggtggg ctggccaagc attgtttggc 780
actcttacgc cagatgttgt tgttctaaca ctcttgtaca gcatagctgg gttaggaata 840
gccattgtta acgacttcaa aagtgttgaa ggagatagag cattaggact tcagtctctc 900
ccagtagctt ttggcaccga aactgcaaaa tggatatgcg ttggtgctat agacattact 960
cagctttctg ttgccggata tctattagca tctgggaaac cttattatgc gttggcgttg 1020
gttgctttga tcattcctca gattgtgttc cagtttaaat actttctcaa ggaccctgtc 1080
aaatacgacg tcaagtacca ggcaagcgcg cagccattct tggtgctcgg aatatttgta 1140
acggcattag catcgcaaca ctga 1164

<210> 17
<211> 387
<212> PRT
<213> Arabidopsis sp
<400> 17

Met Thr Ser Ile Leu Asn Thr Val Ser Thr Ile His Ser Ser Arg Val
1 5 10 15

Thr Ser Val Asp Arg Val Gly Val Leu Ser Leu Arg Asn Ser Asp Ser
20 25 30

Val Glu Phe Thr Arg Arg Arg Ser Gly Phe Ser Thr Leu Ile Tyr Glu
35 40 45

Ser Pro Gly Arg Arg Phe Val Val Arg Ala Ala Glu Thr Asp Thr Asp
50 55 60

Lys Val Lys Ser Gln Thr Pro Asp Lys Ala Pro Ala Gly Gly Ser Ser
65 70 75 80

Ile Asn Gln Leu Leu Gly Ile Lys Gly Ala Ser Gln Glu Thr Asn Lys
85 90 95

Trp Lys Ile Arg Leu Gln Leu Thr Lys Pro Val Thr Trp Pro Pro Leu
100 105 110

Val Trp Gly Val Val Cys Gly Ala Ala Ala Ser Gly Asn Phe His Trp
115 120 125

Thr Pro Glu Asp Val Ala Lys Ser Ile Leu Cys Met Met Met Ser Gly
130 135 140

Pro Cys Leu Thr Gly Tyr Thr Gln Thr Ile Asn Asp Trp Tyr Asp Arg
145 150 155 160

Asp Ile Asp Ala Ile Asn Glu Pro Tyr Arg Pro Ile Pro Ser Gly Ala
165 170 175

Ile Ser Glu Pro Glu Val Ile Thr Gln Val Trp Val Leu Leu Leu Gly
180 185 190

Gly Leu Gly Ile Ala Gly Ile Leu Asp Val Trp Ala Gly His Thr Thr
195 200 205

Pro Thr Val Phe Tyr Leu Ala Leu Gly Gly Ser Leu Leu Ser Tyr Ile
210 215 220

Tyr Ser Ala Pro Pro Leu Lys Leu Lys Gln Asn Gly Trp Val Gly Asn
225 230 235 240

Phe Ala Leu Gly Ala Ser Tyr Ile Ser Leu Pro Trp Trp Ala Gly Gln
245 250 255

Ala Leu Phe Gly Thr Leu Thr Pro Asp Val Val Val Leu Thr Leu Leu
260 265 270

Tyr Ser Ile Ala Gly Leu Gly Ile Ala Ile Val Asn Asp Phe Lys Ser
275 280 285

Val Glu Gly Asp Arg Ala Leu Gly Leu Gln Ser Leu Pro Val Ala Phe
290 295 300

Gly Thr Glu Thr Ala Lys Trp Ile Cys Val Gly Ala Ile Asp Ile Thr
305 310 315 320

Gln Leu Ser Val Ala Gly Tyr Leu Leu Ala Ser Gly Lys Pro Tyr Tyr
325 330 335

Ala Leu Ala Leu Val Ala Leu Ile Ile Pro Gln Ile Val Phe Gln Phe
340 345 350

Lys Tyr Phe Leu Lys Asp Pro Val Lys Tyr Asp Val Lys Tyr Gln Ala
355 360 365

Ser Ala Gln Pro Phe Leu Val Leu Gly Ile Phe Val Thr Ala Leu Ala
370 375 380

Ser Gln His
385

<210> 18
<211> 981
<212> DNA
<213> Arabidopsis sp
<400> 18
atgttgttta gtggttcagc gatcccatta agcagcttct gctctcttcc ggagaaaccc 60
cacactcttc ctatgaaact ctctcccgct gcaatccgat cttcatcctc atctgccccg 120
gggtcgttga acttcgatct gaggacgtat tggacgactc tgatcaccga gatcaaccag 180
aagctggatg aggccatacc ggtcaagcac cctgcgggga tctacgaggc tatgagatac 240
tctgtactcg cacaaggcgc caagcgtgcc cctcctgtga tgtgtgtggc ggcctgcgag 300
ctcttcggtg gcgatcgcct cgccgctttc cccaccgcct gtgccctaga aatggtgcac 360
gcggcttcgt tgatacacga cgacctcccc tgtatggacg acgatcctgt gcgcagagga 420
aagccatcta accacactgt ctacggctct ggcatggcca ttctcgccgg tgacgccctc 480
ttcccactcg ccttccagca cattgtctcc cacacgcctc ctgaccttgt tccccgagcc 540
accatcctca gactcatcac tgagattgcc cgcactgtcg gctccactgg tatggctgca 600
ggccagtacg tcgaccttga aggaggtccc tttcctcttt cctttgttca ggagaagaaa 660
ttcggagcca tgggtgaatg ctctgccgtg tgcggtggcc tattgggcgg tgccactgag 720
gatgagctcc agagtctccg aaggtacggg agagccgtcg ggatgctgta tcaggtggtc 780
gatgacatca ccgaggacaa gaagaagagc tatgatggtg gagcagagaa gggaatgatg 840
gaaatggcgg aagagctcaa ggagaaggcg aagaaggagc ttcaagtgtt tgacaacaag 900
tatggaggag gagacacact tgttcctctc tacaccttcg ttgactacgc tgctcatcga 960
cattttcttc ttcccctctg a 981

<210> 19
<211> 245
<212> DNA
<213> Glycine sp
<400> 19
gcaacatctg ggactgggtt tgtcttgggg agtggtagtg ctgttgatct ttcggcactt 60
tcttgcactt gcttgggtac catgatggtt gctgcatctg ctaactcttt gaatcaggtg 120
tttgagatca ataatgatgc taaaatgaag agaacaagtc gcaggccact accctcagga 180
cgcatcacaa tacctcatgc agttggctgg gcatcctctg ttggattagc tggtacggct 240
ctact 245

<210> 20
<211> 253
<212> DNA
<213> Glycine sp
<400> 20
attggctttc caagatcatt gggttttctt gttgcattca tgaccttcta ctccttgggt 60
ttggcattgt ccaaggatat acctgacgtt gaaggagata aagagcacgg cattgattct 120
tttgcagtac gtctaggtca gaaacgggca ttttggattt gcgtttcctt ttttgaaatg 180
gctttcggag ttggtatcct ggccggagca tcatgctcac acttttggac taaaattttc 240
acgggtatgg gaa 253

<210> 21
<211> 275
<212> DNA
<213> Glycine sp
<400> 21
tgatcttcta ctctctgggt atggcattgt ccaaggatat atctgacgtt aaaggagata 60
aagcatacgg catcgatact ttagcgatac gtttgggtca aaaatgggta ttttggattt 120
gcattatcct ttttgaaatg gcttttggag ttgccctctt ggcaggagca acatcttctt 180
acctttggat taaaattgtc acgggtctgg gacatgctat tcttgcttca attctcttgt 240
accaagccaa atctatatac ttgagcaaca aagtt 275

<210> 22
<211> 299
<212> DNA
<213> Glycine sp
<220>
<221> misc_feature
<222> (1)...(299)
<223> n = A,T,C or G
<400> 22
ccanaatang tncatcttng aaagacaatt ggcctcttca acacacaagt ctgcatgtga 60
agaagaggcc aattgtcttt ccaagatcac ttatngtggc tattgtaatc atgaacttct 120
tctttgtggg tatggcattg gcaaaggata tacctanctg ttgaaggaga taaaatatat 180
ggcattgata cttttgcaat acgtataggt caaaaacaag tattttggat ttgtattttc 240
ctttttgaaa ggctttcgga gtttccctag tggcaggagc aacatcttct agccttggt 299

<210> 23
<211> 767
<212> DNA
<213> Glycine sp
<400> 23
gtggaggctg tggttgctgc cctgtttatg aatatttata ttgttggttt gaatcaattg 60
tctgatgttg aaatagacaa gataaacaag ccgtatcttc cattagcatc tggggaatat 120
tcctttgaaa ctggtgtcac tattgttgca tctttttcaa ttctgagttt ttggcttggc 180
tgggttgtag gttcatggcc attattttgg gccctttttg taagctttgt gctaggaact 240
gcttattcaa tcaatgtgcc tctgttgaga tggaagaggt ttgcagtgct tgcagcgatg 300
tgcattctag ctgttcgggc agtaatagtt caacttgcat ttttccttca catgcagact 360
catgtgtaca agaggccacc tgtcttttca agaccattga tttttgctac tgcattcatg 420
agcttcttct ctgtagttat agcactgttt aaggatatac ctgacattga aggagataaa 480
gtatttggca tccaatcttt ttcagtgtgt ttaggtcaga agccggtgtt ctggacttgt 540
gttacccttc ttgaaatagc ttatggagtc gccctcctgg tgggagctgc atctccttgt 600
ctttggagca aaattttcac gggtctggga cacgctgtgc tggcttcaat tctctggttt 660
catgccaaat ctgtagattt gaaaagcaaa gcttcgataa catccttcta tatgtttatt 720
tggaagctat tttatgcaga atacttactc attccttttg ttagatg 767

<210> 24
<211> 255
<212> PRT
<213> Glycine sp
<400> 24

Val Glu Ala Val Val Ala Ala Leu Phe Met Asn Ile Tyr Ile Val Gly
1 5 10 15

Leu Asn Gln Leu Ser Asp Val Glu Ile Asp Lys Ile Asn Lys Pro Tyr
20 25 30

Leu Pro Leu Ala Ser Gly Glu Tyr Ser Phe Glu Thr Gly Val Thr Ile
35 40 45

Val Ala Ser Phe Ser Ile Leu Ser Phe Trp Leu Gly Trp Val Val Gly
50 55 60

Ser Trp Pro Leu Phe Trp Ala Leu Phe Val Ser Phe Val Leu Gly Thr
65 70 75 80

Ala Tyr Ser Ile Asn Val Pro Leu Leu Arg Trp Lys Arg Phe Ala Val
85 90 95

Leu Ala Ala Met Cys Ile Leu Ala Val Arg Ala Val Ile Val Gln Leu
100 105 110

Ala Phe Phe Leu His Met Gln Thr His Val Tyr Lys Arg Pro Pro Val
115 120 125

Phe Ser Arg Pro Leu Ile Phe Ala Thr Ala Phe Met Ser Phe Phe Ser
130 135 140

Val Val Ile Ala Leu Phe Lys Asp Ile Pro Asp Ile Glu Gly Asp Lys
145 150 155 160

Val Phe Gly Ile Gln Ser Phe Ser Val Cys Leu Gly Gln Lys Pro Val
165 170 175

Phe Trp Thr Cys Val Thr Leu Leu Glu Ile Ala Tyr Gly Val Ala Leu
180 185 190

Leu Val Gly Ala Ala Ser Pro Cys Leu Trp Ser Lys Ile Phe Thr Gly
195 200 205

Leu Gly His Ala Val Leu Ala Ser Ile Leu Trp Phe His Ala Lys Ser
210 215 220

Val Asp Leu Lys Ser Lys Ala Ser Ile Thr Ser Phe Tyr Met Phe Ile
225 230 235 240

Trp Lys Leu Phe Tyr Ala Glu Tyr Leu Leu Ile Pro Phe Val Arg
245 250 255



<210> 25
<211> 360
<212> DNA
<213> Zea sp
<220>
<221> misc_feature
<222> (1)...(360)
<223> n = A,T,C or G
<400> 25
ggcgtcttca cttgttctgg tcttctcgta tcccctgatg aagaggttca cattttggcc 60
tcaggcttat cttggcctga cattcaactg gggagcttta ctagggtggg ctgctattaa 120
ggaaagcata gaccctgcaa atcatccttc cattgtatac agctggtatt tgttggacgc 180
tggtgtatga tactatatat gcgcatcagg tgtttcgcta tccctacttt catattaatc 240
cttgatgaag tggccatttc atgttgtcgc ggtggtctta tacttgcata tctccatgca 300
tctcaggaca aagangatga cctgaaagta ggagtccaag tccacagctt aagatttggg 360

<210> 26
<211> 299
<212> DNA
<213> Zea sp
<220>
<221> misc_feature
<222> (1)...(299)
<223> n = A,T,C or G
<400> 26
gatggttgca gcatctgcaa ataccctcaa ccaggtgttt gngataaaaa atgatgctaa 60
aatgaaaagg acaatgcgtg ccccctgcca tctggtcgca ttagtcctgc acatgctgcg 120
atgtgggcta caagtgttgg agttgcagga acagctttgt tggcctggaa ggctaatggc 180
ttggcagctg ggcttgcagc ttctaatctt gttctgtatg catttgtgta tacgccgttg 240
aagcaaatac accctgttaa tacatgggtt ggggcagtcg ttggtgccat cccaccact 299



<210> 27
<211> 255
<212> DNA
<213> Zea sp
<220>
<221> misc_feature
<222> (1)...(255)
<223> n = A,T,C or G
<400> 27
anacttgcat atctccatgc ntctcaggac aaagangatg acctgaaagt aggtgtcaag 60
tccacagcat taagatttgg agatttgacc nnatactgna tcagtggctt tggcgcggca 120
tgcttcggca gcttagcact cagtggttac aatgctgacc ttggttggtg tttagtgtga 180
tgcttgagcg aagaatggta tngtttttac ttgatattga ctccagacct gaaatcatgt 240
tggacagggt ggccc 255

<210> 28
<211> 257
<212> DNA
<213> Zea sp
<400> 28
attgaagggg ataggactct ggggcttcag tcacttcctg ttgcttttgg gatggaaact 60
gcaaaatgga tttgtgttgg agcaattgat atcactcaat tatctgttgc aggttaccta 120
ttgagcaccg gtaagctgta ttatgccctg gtgttgcttg ggctaacaat tcctcaggtg 180
ttctttcagt tccagtactt cctgaaggac cctgtgaagt atgatgtcaa atatcaggca 240
agcgcacaac cattctt 257

<210> 29
<211> 368
<212> DNA
<213> Zea sp
<400> 29
atccagttgc aaataataat ggcgttcttc tctgttgtaa tagcactatt caaggatata 60
cctgacatcg aaggggaccg catattcggg atccgatcct tcagcgtccg gttagggcaa 120
aagaaggtct tttggatctg cgttggcttg cttgagatgg cctacagcgt tgcgatactg 180
atgggagcta cctcttcctg tttgtggagc aaaacagcaa ccatcgctgg ccattccata 240
cttgccgcga tcctatggag ctgcgcgcga tcggtggact tgacgagcaa agccgcaata 300
acgtccttct acatgttcat ctggaagctg ttctacgcgg agtacctgct catccctctg 360
gtgcggtg 368

<210> 30
<211> 122
<212> PRT
<213> Zea sp
<400> 30

Ile Gln Leu Gln Ile Ile Met Ala Phe Phe Ser Val Val Ile Ala Leu
1 5 10 15

Phe Lys Asp Ile Pro Asp Ile Glu Gly Asp Arg Ile Phe Gly Ile Arg
20 25 30

Ser Phe Ser Val Arg Leu Gly Gln Lys Lys Val Phe Trp Ile Cys Val
35 40 45

Gly Leu Leu Glu Met Ala Tyr Ser Val Ala Ile Leu Met Gly Ala Thr
50 55 60

Ser Ser Cys Leu Trp Ser Lys Thr Ala Thr Ile Ala Gly His Ser Ile
65 70 75 80

Leu Ala Ala Ile Leu Trp Ser Cys Ala Arg Ser Val Asp Leu Thr Ser
85 90 95

Lys Ala Ala Ile Thr Ser Phe Tyr Met Phe Ile Trp Lys Leu Phe Tyr
100 105 110

Ala Glu Tyr Leu Leu Ile Pro Leu Val Arg
115 120

<210> 31
<211> 278
<212> DNA
<213> Zea sp
<400> 31
tattcagcac cacctctcaa gctcaagcag aatggatgga ttgggaactt cgctctgggt 60
gcgagttaca tcagcttgcc ctggtgggct ggccaggcgt tatttggaac tcttacacca 120
gatatcattg tcttgactac tttgtacagc atagctgggc tagggattgc tattgtaaat 180
gatttcaaga gtattgaagg ggataggact ctggggcttc agtcacttcc tgttgctttt 240
gggatggaaa ctgcaaaatg gatttgtgtt ggagcaat 278

<210> 32
<211> 292
<212> PRT
<213> Synechocystis sp
<400> 32

Met Val Ala Gln Thr Pro Ser Ser Pro Pro Leu Trp Leu Thr Ile Ile
1 5 10 15

Tyr Leu Leu Arg Trp His Lys Pro Ala Gly Arg Leu Ile Leu Met Ile
20 25 30

Pro Ala Leu Trp Ala Val Cys Leu Ala Ala Gln Gly Leu Pro Pro Leu
35 40 45

Pro Leu Leu Gly Thr Ile Ala Leu Gly Thr Leu Ala Thr Ser Gly Leu
50 55 60

Gly Cys Val Val Asn Asp Leu Trp Asp Arg Asp Ile Asp Pro Gln Val
65 70 75 80

Glu Arg Thr Lys Gln Arg Pro Leu Ala Ala Arg Ala Leu Ser Val Gln
85 90 95

Val Gly Ile Gly Val Ala Leu Val Ala Leu Leu Cys Ala Ala Gly Leu
100 105 110

Ala Phe Tyr Leu Thr Pro Leu Ser Phe Trp Leu Cys Val Ala Ala Val
115 120 125

Pro Val Ile Val Ala Tyr Pro Gly Ala Lys Arg Val Phe Pro Val Pro
130 135 140

Gln Leu Val Leu Ser Ile Ala Trp Gly Phe Ala Val Leu Ile Ser Trp
145 150 155 160

Ser Ala Val Thr Gly Asp Leu Thr Asp Ala Thr Trp Val Leu Trp Gly
165 170 175

Ala Thr Val Phe Trp Thr Leu Gly Phe Asp Thr Val Tyr Ala Met Ala
180 185 190

Asp Arg Glu Asp Asp Arg Arg Ile Gly Val Asn Ser Ser Ala Leu Phe
195 200 205

Phe Gly Gln Tyr Val Gly Glu Ala Val Gly Ile Phe Phe Ala Leu Thr
210 215 220

Ile Gly Cys Leu Phe Tyr Leu Gly Met Ile Leu Met Leu Asn Pro Leu
225 230 235 240

Tyr Trp Leu Ser Leu Ala Ile Ala Ile Val Gly Trp Val Ile Gln Tyr
245 250 255

Ile Gln Leu Ser Ala Pro Thr Pro Glu Pro Lys Leu Tyr Gly Gln Ile
260 265 270

Phe Gly Gln Asn Val Ile Ile Gly Phe Val Leu Leu Ala Gly Met Leu
275 280 285

Leu Gly Trp Leu
290



<210> 33
<211> 316
<212> PRT
<213> Synechocystis sp
<400> 33

Met Val Thr Ser Thr Lys Ile His Arg Gln His Asp Ser Met Gly Ala
1 5 10 15

Val Cys Lys Ser Tyr Tyr Gln Leu Thr Lys Pro Arg Ile Ile Pro Leu
20 25 30

Leu Leu Ile Thr Thr Ala Ala Ser Met Trp Ile Ala Ser Glu Gly Arg
35 40 45

Val Asp Leu Pro Lys Leu Leu Ile Thr Leu Leu Gly Gly Thr Leu Ala
50 55 60

Ala Ala Ser Ala Gln Thr Leu Asn Cys Ile Tyr Asp Gln Asp Ile Asp
65 70 75 80

Tyr Glu Met Leu Arg Thr Arg Ala Arg Pro Ile Pro Ala Gly Lys Val
85 90 95

Gln Pro Arg His Ala Leu Ile Phe Ala Leu Ala Leu Gly Val Leu Ser
100 105 110

Phe Ala Leu Leu Ala Thr Phe Val Asn Val Leu Ser Gly Cys Leu Ala
115 120 125

Leu Ser Gly Ile Val Phe Tyr Met Leu Val Tyr Thr His Trp Leu Lys
130 135 140

Arg His Thr Ala Gln Asn Ile Val Ile Gly Gly Ala Ala Gly Ser Ile
145 150 155 160

Pro Pro Leu Val Gly Trp Ala Ala Val Thr Gly Asp Leu Ser Trp Thr
165 170 175

Pro Trp Val Leu Phe Ala Leu Ile Phe Leu Trp Thr Pro Pro His Phe
180 185 190

Trp Ala Leu Ala Leu Met Ile Lys Asp Asp Tyr Ala Gln Val Asn Val
195 200 205

Pro Met Leu Pro Val Ile Ala Gly Glu Glu Lys Thr Val Ser Gln Ile
210 215 220

Trp Tyr Tyr Ser Leu Leu Val Val Pro Phe Ser Leu Leu Leu Val Tyr
225 230 235 240

Pro Leu His Gln Leu Gly Ile Leu Tyr Leu Ala Ile Ala Ile Ile Leu
245 250 255

Gly Gly Gln Phe Leu Val Lys Ala Trp Gln Leu Lys Gln Ala Pro Gly
260 265 270

Asp Arg Asp Leu Ala Arg Gly Leu Phe Lys Phe Ser Ile Phe Tyr Leu
275 280 285

Met Leu Leu Cys Leu Ala Met Val Ile Asp Ser Leu Pro Val Thr His
290 295 300

Gln Leu Val Ala Gln Met Gly Thr Leu Leu Leu Gly
305 310 315

<210> 34
<211> 324
<212> PRT
<213> Synechocystis sp
<400> 34

Met Ser Asp Thr Gln Asn Thr Gly Gln Asn Gln Ala Lys Ala Arg Gln
1 5 10 15

Leu Leu Gly Met Lys Gly Ala Ala Pro Gly Glu Ser Ser Ile Trp Lys
20 25 30

Ile Arg Leu Gln Leu Met Lys Pro Ile Thr Trp Ile Pro Leu Ile Trp
35 40 45

Gly Val Val Cys Gly Ala Ala Ser Ser Gly Gly Tyr Ile Trp Ser Val
50 55 60

Glu Asp Phe Leu Lys Ala Leu Thr Cys Met Leu Leu Ser Gly Pro Leu
65 70 75 80

Met Thr Gly Tyr Thr Gln Thr Leu Asn Asp Phe Tyr Asp Arg Asp Ile
85 90 95

Asp Ala Ile Asn Glu Pro Tyr Arg Pro Ile Pro Ser Gly Ala Ile Ser
100 105 110

Val Pro Gln Val Val Thr Gln Ile Leu Ile Leu Leu Val Ala Gly Ile
115 120 125

Gly Val Ala Tyr Gly Leu Asp Val Trp Ala Gln His Asp Phe Pro Ile
130 135 140

Met Met Val Leu Thr Leu Gly Gly Ala Phe Val Ala Tyr Ile Tyr Ser
145 150 155 160

Ala Pro Pro Leu Lys Leu Lys Gln Asn Gly Trp Leu Gly Asn Tyr Ala
165 170 175

Leu Gly Ala Ser Tyr Ile Ala Leu Pro Trp Trp Ala Gly His Ala Leu
180 185 190

Phe Gly Thr Leu Asn Pro Thr Ile Met Val Leu Thr Leu Ile Tyr Ser
195 200 205

Leu Ala Gly Leu Gly Ile Ala Val Val Asn Asp Phe Lys Ser Val Glu
210 215 220

Gly Asp Arg Gln Leu Gly Leu Lys Ser Leu Pro Val Met Phe Gly Ile
225 230 235 240

Gly Thr Ala Ala Trp Ile Cys Val Ile Met Ile Asp Val Phe Gln Ala
245 250 255

Gly Ile Ala Gly Tyr Leu Ile Tyr Val His Gln Gln Leu Tyr Ala Thr
260 265 270

Ile Val Leu Leu Leu Leu Ile Pro Gln Ile Thr Phe Gln Asp Met Tyr
275 280 285

Phe Leu Arg Asn Pro Leu Glu Asn Asp Val Lys Tyr Gln Ala Ser Ala
290 295 300

Gln Pro Phe Leu Val Phe Gly Met Leu Ala Thr Gly Leu Ala Leu Gly
305 310 315 320

His Ala Gly Ile



<210> 35
<211> 307
<212> PRT
<213> Synechocystis sp
<400> 35

Met Thr Glu Ser Ser Pro Leu Ala Pro Ser Thr Ala Pro Ala Thr Arg
1 5 10 15

Lys Leu Trp Leu Ala Ala Ile Lys Pro Pro Met Tyr Thr Val Ala Val
20 25 30

Val Pro Ile Thr Val Gly Ser Ala Val Ala Tyr Gly Leu Thr Gly Gln
35 40 45

Trp His Gly Asp Val Phe Thr Ile Phe Leu Leu Ser Ala Ile Ala Ile
50 55 60

Ile Ala Trp Ile Asn Leu Ser Asn Asp Val Phe Asp Ser Asp Thr Gly
65 70 75 80

Ile Asp Val Arg Lys Ala His Ser Val Val Asn Leu Thr Gly Asn Arg
85 90 95

Asn Leu Val Phe Leu Ile Ser Asn Phe Phe Leu Leu Ala Gly Val Leu
100 105 110

Gly Leu Met Ser Met Ser Trp Arg Ala Gln Asp Trp Thr Val Leu Glu
115 120 125

Leu Ile Gly Val Ala Ile Phe Leu Gly Tyr Thr Tyr Gln Gly Pro Pro
130 135 140

Phe Arg Leu Gly Tyr Leu Gly Leu Gly Glu Leu Ile Cys Leu Ile Thr
145 150 155 160

Phe Gly Pro Leu Ala Ile Ala Ala Ala Tyr Tyr Ser Gln Ser Gln Ser
165 170 175

Phe Ser Trp Asn Leu Leu Thr Pro Ser Val Phe Val Gly Ile Ser Thr
180 185 190

Ala Ile Ile Leu Phe Cys Ser His Phe His Gln Val Glu Asp Asp Leu
195 200 205

Ala Ala Gly Lys Lys Ser Pro Ile Val Arg Leu Gly Thr Lys Leu Gly
210 215 220

Ser Gln Val Leu Thr Leu Ser Val Val Ser Leu Tyr Leu Ile Thr Ala
225 230 235 240

Ile Gly Val Leu Cys His Gln Ala Pro Trp Gln Thr Leu Leu Ile Ile
245 250 255

Ala Ser Leu Pro Trp Ala Val Gln Leu Ile Arg His Val Gly Gln Tyr
260 265 270

His Asp Gln Pro Glu Gln Val Ser Asn Cys Lys Phe Ile Ala Val Asn
275 280 285

Leu His Phe Phe Ser Gly Met Leu Met Ala Ala Gly Tyr Gly Trp Ala
290 295 300

Gly Leu Gly
305



<210> 36
<211> 927
<212> DNA
<213> Synechocystis sp
<400> 36
atggcaacta tccaagcttt ttggcgcttc tcccgccccc ataccatcat tggtacaact 60
ctgagcgtct gggctgtgta tctgttaact attctcgggg atggaaactc agttaactcc 120
cctgcttccc tggatttagt gttcggcgct tggctggcct gcctgttggg taatgtgtac 180
attgtcggcc tcaaccaatt gtgggatgtg gacattgacc gcatcaataa gccgaatttg 240
cccctagcta acggagattt ttctatcgcc cagggccgtt ggattgtggg actttgtggc 300
gttgcttcct tggcgatcgc ctggggatta gggctatggc tggggctaac ggtgggcatt 360
agtttgatta ttggcacggc ctattcggtg ccgccagtga ggttaaagcg cttttccctg 420
ctggcggccc tgtgtattct gacggtgcgg ggaattgtgg ttaacttggg cttattttta 480
ttttttagaa ttggtttagg ttatcccccc actttaataa cccccatctg ggttttgact 540
ttatttatct tagttttcac cgtggcgatc gccattttta aagatgtgcc agatatggaa 600
ggcgatcggc aatttaagat tcaaacttta actttgcaaa tcggcaaaca aaacgttttt 660
cggggaacct taattttact cactggttgt tatttagcca tggcaatctg gggcttatgg 720
gcggctatgc ctttaaatac tgctttcttg attgtttccc atttgtgctt attagcctta 780
ctctggtggc ggagtcgaga tgtacactta gaaagcaaaa ccgaaattgc tagtttttat 840
cagtttattt ggaagctatt tttcttagag tacttgctgt atcccttggc tctgtggtta 900
cctaattttt ctaatactat tttttag 927



<210> 37
<211> 308
<212> PRT
<213> Synechocystis sp
<400> 37

Met Ala Thr Ile Gln Ala Phe Trp Arg Phe Ser Arg Pro His Thr Ile
1 5 10 15

Ile Gly Thr Thr Leu Ser Val Trp Ala Val Tyr Leu Leu Thr Ile Leu
20 25 30

Gly Asp Gly Asn Ser Val Asn Ser Pro Ala Ser Leu Asp Leu Val Phe
35 40 45

Gly Ala Trp Leu Ala Cys Leu Leu Gly Asn Val Tyr Ile Val Gly Leu
50 55 60

Asn Gln Leu Trp Asp Val Asp Ile Asp Arg Ile Asn Lys Pro Asn Leu
65 70 75 80

Pro Leu Ala Asn Gly Asp Phe Ser Ile Ala Gln Gly Arg Trp Ile Val
85 90 95

Gly Leu Cys Gly Val Ala Ser Leu Ala Ile Ala Trp Gly Leu Gly Leu
100 105 110

Trp Leu Gly Leu Thr Val Gly Ile Ser Leu Ile Ile Gly Thr Ala Tyr
115 120 125

Ser Val Pro Pro Val Arg Leu Lys Arg Phe Ser Leu Leu Ala Ala Leu
130 135 140

Cys Ile Leu Thr Val Arg Gly Ile Val Val Asn Leu Gly Leu Phe Leu
145 150 155 160

Phe Phe Arg Ile Gly Leu Gly Tyr Pro Pro Thr Leu Ile Thr Pro Ile
165 170 175

Trp Val Leu Thr Leu Phe Ile Leu Val Phe Thr Val Ala Ile Ala Ile
180 185 190

Phe Lys Asp Val Pro Asp Met Glu Gly Asp Arg Gln Phe Lys Ile Gln
195 200 205

Thr Leu Thr Leu Gln Ile Gly Lys Gln Asn Val Phe Arg Gly Thr Leu
210 215 220

Ile Leu Leu Thr Gly Cys Tyr Leu Ala Met Ala Ile Trp Gly Leu Trp
225 230 235 240

Ala Ala Met Pro Leu Asn Thr Ala Phe Leu Ile Val Ser His Leu Cys
245 250 255

Leu Leu Ala Leu Leu Trp Trp Arg Ser Arg Asp Val His Leu Glu Ser
260 265 270

Lys Thr Glu Ile Ala Ser Phe Tyr Gln Phe Ile Trp Lys Leu Phe Phe
275 280 285

Leu Glu Tyr Leu Leu Tyr Pro Leu Ala Leu Trp Leu Pro Asn Phe Ser
290 295 300

Asn Thr Ile Phe
305



<210> 38
<211> 1092
<212> DNA
<213> Synechocystis sp
<400> 38
atgaaatttc cgccccacag tggttaccat tggcaaggtc aatcaccttt ctttgaaggt 60
tggtacgtgc gcctgctttt gccccaatcc ggggaaagtt ttgcttttat gtactccatc 120
gaaaatcctg ctagcgatca tcattacggc ggcggtgctg tgcaaatttt agggccggct 180
acgaaaaaac aagaaaatca ggaagaccaa cttgtttggc ggacatttcc ctcggtaaaa 240
aaattttggg ccagtcctcg ccagtttgcc ctagggcatt ggggaaaatg tagggataac 300
aggcaggcga aacccctact ctccgaagaa ttttttgcca cggtcaagga aggttatcaa 360
atccatcaaa atcagcacca aggacaaatc attcatggcg atcgccattg tcgttggcag 420
ttcaccgtag aaccggaagt aacttggggg agtcctaacc gatttcctcg ggctacagcg 480
ggttggcttt cctttttacc cttgtttgat cccggttggc aaattctttt agcccaaggt 540
agagcgcacg gctggctgaa atggcagagg gaacagtatg aatttgacca cgccctagtt 600
tatgccgaaa aaaattgggg tcactccttt ccctcccgct ggttttggct ccaagcaaat 660
tattttcctg accatccagg actgagcgtc actgccgctg gcggggaacg gattgttctt 720
ggtcgccccg aagaggtagc tttaattggc ttacatcacc aaggtaattt ttacgaattt 780
ggcccgggcc atggcacagt cacttggcaa gtagctccct ggggccgttg gcaattaaaa 840
gccagcaatg ataggtattg ggtcaagttg tccggaaaaa cagataaaaa aggcagttta 900
gtccacactc ccaccgccca gggcttacaa ctcaactgcc gagataccac taggggctat 960
ttgtatttgc aattgggatc tgtgggtcac ggcctgatag tgcaagggga aacggacacc 1020
gcggggctag aagttggagg tgattggggt ttaacagagg aaaatttgag caaaaaaaca 1080
gtgccattct ga 1092

<210> 39
<211> 363
<212> PRT
<213> Synechocystis sp
<400> 39

Met Lys Phe Pro Pro His Ser Gly Tyr His Trp Gln Gly Gln Ser Pro
1 5 10 15

Phe Phe Glu Gly Trp Tyr Val Arg Leu Leu Leu Pro Gln Ser Gly Glu
20 25 30

Ser Phe Ala Phe Met Tyr Ser Ile Glu Asn Pro Ala Ser Asp His His
35 40 45

Tyr Gly Gly Gly Ala Val Gln Ile Leu Gly Pro Ala Thr Lys Lys Gln
50 55 60

Glu Asn Gln Glu Asp Gln Leu Val Trp Arg Thr Phe Pro Ser Val Lys
65 70 75 80

Lys Phe Trp Ala Ser Pro Arg Gln Phe Ala Leu Gly His Trp Gly Lys
85 90 95

Cys Arg Asp Asn Arg Gln Ala Lys Pro Leu Leu Ser Glu Glu Phe Phe
100 105 110

Ala Thr Val Lys Glu Gly Tyr Gln Ile His Gln Asn Gln His Gln Gly
115 120 125

Gln Ile Ile His Gly Asp Arg His Cys Arg Trp Gln Phe Thr Val Glu
130 135 140

Pro Glu Val Thr Trp Gly Ser Pro Asn Arg Phe Pro Arg Ala Thr Ala
145 150 155 160

Gly Trp Leu Ser Phe Leu Pro Leu Phe Asp Pro Gly Trp Gln Ile Leu
165 170 175

Leu Ala Gln Gly Arg Ala His Gly Trp Leu Lys Trp Gln Arg Glu Gln
180 185 190

Tyr Glu Phe Asp His Ala Leu Val Tyr Ala Glu Lys Asn Trp Gly His
195 200 205

Ser Phe Pro Ser Arg Trp Phe Trp Leu Gln Ala Asn Tyr Phe Pro Asp
210 215 220

His Pro Gly Leu Ser Val Thr Ala Ala Gly Gly Glu Arg Ile Val Leu
225 230 235 240

Gly Arg Pro Glu Glu Val Ala Leu Ile Gly Leu His His Gln Gly Asn
245 250 255

Phe Tyr Glu Phe Gly Pro Gly His Gly Thr Val Thr Trp Gln Val Ala
260 265 270

Pro Trp Gly Arg Trp Gln Leu Lys Ala Ser Asn Asp Arg Tyr Trp Val
275 280 285

Lys Leu Ser Gly Lys Thr Asp Lys Lys Gly Ser Leu Val His Thr Pro
290 295 300

Thr Ala Gln Gly Leu Gln Leu Asn Cys Arg Asp Thr Thr Arg Gly Tyr
305 310 315 320

Leu Tyr Leu Gln Leu Gly Ser Val Gly His Gly Leu Ile Val Gln Gly
325 330 335

Glu Thr Asp Thr Ala Gly Leu Glu Val Gly Gly Asp Trp Gly Leu Thr
340 345 350

Glu Glu Asn Leu Ser Lys Lys Thr Val Pro Phe
355 360

<210> 40
<211> 56
<212> DNA
<213> Artificial Sequence
<400> 40
cgcgatttaa atggcgcgcc ctgcaggcgg ccgcctgcag ggcgcgccat ttaaat 56

<210> 41
<211> 32
<212> DNA
<213> Artificial Sequence
<400> 41
tcgaggatcc gcggccgcaa gcttcctgca gg 32

<210> 42
<211> 32
<212> DNA
<213> Artificial Sequence
<400> 42
tcgacctgca ggaagcttgc ggccgcggat cc 32

<210> 43
<211> 32
<212> DNA
<213> Artificial Sequence
<400> 43
tcgacctgca ggaagcttgc ggccgcggat cc 32

<210> 44
<211> 32
<212> DNA
<213> Artificial Sequence
<400> 44
tcgaggatcc gcggccgcaa gcttcctgca gg 32

<210> 45
<211> 36
<212> DNA
<213> Artificial Sequence
<400> 45
tcgaggatcc gcggccgcaa gcttcctgca ggagct 36

<210> 46
<211> 28
<212> DNA
<213> Artificial Sequence
<400> 46
cctgcaggaa gcttgcggcc gcggatcc 28

<210> 47
<211> 36
<212> DNA
<213> Artificial Sequence
<400> 47
tcgacctgca ggaagcttgc ggccgcggat ccagct 36

<210> 48
<211> 28
<212> DNA
<213> Artificial Sequence
<400> 48
ggatccgcgg ccgcaagctt cctgcagg 28

<210> 49
<211> 39
<212> DNA
<213> Artificial Sequence
<400> 49
gatcacctgc aggaagcttg cggccgcgga tccaatgca 39

<210> 50
<211> 31
<212> DNA
<213> Artificial Sequence
<400> 50
ttggatccgc ggccgcaagc ttcctgcagg t 31



<210> 51
<211> 41
<212> DNA
<213> Artificial Sequence
<400> 51
ggatccgcgg ccgcacaatg gagtctctgc tctctagttc t 41

<210> 52
<211> 38
<212> DNA
<213> Artificial Sequence
<400> 52
ggatcctgca ggtcacttca aaaaaggtaa cagcaagt 38

<210> 53
<211> 45
<212> DNA
<213> Artificial Sequence
<400> 53
ggatccgcgg ccgcacaatg gcgttttttg ggctctcccg tgttt 45

<210> 54
<211> 40
<212> DNA
<213> Artificial Sequence
<400> 54
ggatcctgca ggttattgaa aacttcttcc aagtacaact 40

<210> 55
<211> 38
<212> DNA
<213> Artificial Sequence
<400> 55
ggatccgcgg ccgcacaatg tggcgaagat ctgttgtt 38

<210> 56
<211> 37
<212> DNA
<213> Artificial Sequence
<400> 56
ggatcctgca ggtcatggag agtagaagga aggagct 37

<210> 57
<211> 50
<212> DNA
<213> Artificial Sequence
<400> 57
ggatccgcgg ccgcacaatg gtacttgccg aggttccaaa gcttgcctct 50

<210> 58
<211> 38
<212> DNA
<213> Artificial Sequence
<400> 58
ggatcctgca ggtcacttgt ttctggtgat gactctat 38

<210> 59
<211> 38
<212> DNA
<213> Artificial Sequence
<400> 59
ggatccgcgg ccgcacaatg acttcgattc tcaacact 38

<210> 60
<211> 36
<212> DNA
<213> Artificial Sequence
<400> 60
ggatcctgca ggtcagtgtt gcgatgctaa tgccgt 36

<210> 61
<211> 22
<212> DNA
<213> Artificial Sequence
<400> 61
taatgtgtac attgtcggcc tc 22

<210> 62
<211> 60
<212> DNA
<213> Artificial Sequence
<400> 62
gcaatgtaac atcagagatt ttgagacaca acgtggcttt ccacaattcc ccgcaccgtc 60

<210> 63
<211> 22
<212> DNA
<213> Artificial Sequence
<400> 63
aggctaataa gcacaaatgg ga 22

<210> 64
<211> 63
<212> DNA
<213> Artificial Sequence
<400> 64
ggtatgagtc agcaacacct tcttcacgag gcagacctca gcggaattgg tttaggttat 60
ccc 63

<210> 65
<211> 26
<212> DNA
<213> Artificial Sequence
<400> 65
ggatccatgg ttgcccaaac cccatc 26

<210> 66
<211> 61
<212> DNA
<213> Artificial Sequence
<400> 66
gcaatgtaac atcagagatt ttgagacaca acgtggcttt gggtaagcaa caatgaccgg 60
c 61

<210> 67
<211> 25
<212> DNA
<213> Artificial Sequence
<400> 67
gaattctcaa agccagccca gtaac 25

<210> 68
<211> 63
<212> DNA
<213> Artificial Sequence
<400> 68
ggtatgagtc agcaacacct tcttcacgag gcagacctca gcgggtgcga aaagggtttt 60
ccc 63

<210> 69
<211> 23
<212> DNA
<213> Artificial Sequence
<400> 69
ccagtggttt aggctgtgtg gtc 23

<210> 70
<211> 21
<212> DNA
<213> Artificial Sequence
<400> 70
ctgagttgga tgtattggat c 21

<210> 71
<211> 28
<212> DNA
<213> Artificial Sequence
<400> 71
ggatccatgg ttacttcgac aaaaatcc 28

<210> 72
<211> 60
<212> DNA
<213> Artificial Sequence
<400> 72
gcaatgtaac atcagagatt ttgagacaca acgtggcttt gctaggcaac cgcttagtac 60

<210> 73
<211> 28
<212> DNA
<213> Artificial Sequence
<400> 73
gaattcttaa cccaacagta aagttccc 28

<210> 74
<211> 63
<212> DNA
<213> Artificial Sequence
<400> 74
ggtatgagtc agcaacacct tcttcacgag gcagacctca gcgccggcat tgtcttttac 60
atg 63

35 <210> 75 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
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<400> 75 

ggaacccttg cagccgcttc 20 

<210> 76 
5 <211> 22 
<212> DNA 

<213> Artificial Sequence 
<400> 76 

10 gtatgcccaa ctggtgcaga gg 22 

<210> 77 
<211> 28 
<212> DNA 
15 <213> Artificial Sequence 

* 

<400> 77 

ggatccatgt ctgacacaca aaataccg 28 

20 <210> 78 
<211> 62 
<212> DNA 

<213> Artificial Sequence 
25 <400> 78 

gcaatgtaac atcagagatt ttgagacaca acgtggcttt cgccaatacc agccaccaac 60 
ag 62 

<210> 79 
30 <211> 27 
<212> DNA 

<213> Artificial Sequence 
<400> 79 

35 gaattctcaa atccccgcat ggcctag 27 

<210> 80 
<211> 65 
<212> DNA 
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<213> Artificial Sequence 

<400> 80 

ggtatgagtc agcaacacct tcttcacgag gcagacctca gcggcctacg gcttggacgt 60 
5 gtggg 65 

<210> 81 
<211> 21 
<212> DNA 
10 <213> Artificial Sequence 

<400> 81 

cacttggatt cccctgatct g 21 

15 <210> 82 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
20 <400> 82 

gcaatacccg cttggaaaac g 21 

<210> 83 
<211> 29 
25 <212> DNA 

<213> Artificial Sequence 



<400> 83 

ggatccatga ccgaatcttc gcccctagc 29 

<210> 84
<211> 61
<212> DNA
<213> Artificial Sequence
<400> 84
gcaatgtaac atcagagatt ttgagacaca acgtggcttt caatcctagg tagccgaggc 60
g 61
<210> 85
<211> 27
<212> DNA
<213> Artificial Sequence
<400> 85
gaattcttag cccaggccag cccagcc 27

<210> 86
<211> 66
<212> DNA
<213> Artificial Sequence
<400> 86
ggtatgagtc agcaacacct tcttcacgag gcagacctca gcggggaatt gatttgttta 60
attacc 66

<210> 87
<211> 21
<212> DNA
<213> Artificial Sequence
<400> 87
gcgatcgcca ttatcgcttg g 21

<210> 88
<211> 24
<212> DNA
<213> Artificial Sequence
<400> 88
gcagactggc aattatcagt aacg 24



<210> 89
<211> 25
<212> DNA
<213> Artificial Sequence
<400> 89
ccatggattc gagtaaagtt gtcgc 25

<210> 90
<211> 25
<212> DNA
<213> Artificial Sequence
<400> 90
gaattcactt caaaaaaggt aacag 25

<210> 91
<211> 4550
<212> DNA
<213> Arabidopsis sp
<400> 91
attttacacc aatttgatca cttaactaaa ttaattaaat tagatgatta tcccaccata 60
tttttgagca ttaaaccata aaaccatagt tataagtaac tgttttaatc gaatatgact 120
cgattaagat taggaaaaat ttataaccgg taattaagaa aacattaacc gtagtaaccg 180
taaatgccga ttcctccctt gtctaaaaga cagaaaacat atattttatt ttgccccata 240
tgtttcactc tatttaattt caggcacaat acttttggtt ggtaacaaaa ctaaaaagga 300
caacacgtga tacttttcct cgtccgtcag tcagattttt tttaaactag aaacaagtgg 360
caaatctaca ccacattttt tgcttaatct attaacttgt aagttttaaa ttcctaaaaa 420
agtctaacta attcttctaa tataagtaca ttccctaaat ttcccaaaaa gtcaaattaa 480
taattttcaa aatctaatct aaatatctaa taattcaaaa tcattaaaaa gacacgcaac 540
aatgacacca attaatcatc ctcgacccac acaattctac agttctcatg ctaaaccata 600
ttttttgctc tctgttcctt caaaatcatt tctttctctt ctttgattcc caaagatcac 660
ttctttgtct ttgatttttg attttttttc tctctggcgt gaaggaagaa gctttatttc 720
atggagtctc tgctctctag ttcttctctt gtttccgctg gtaaatctcg tccttttctg 780
gtttcaggtt ttatttgttg tttaggtttc gtttttgtga ttcagaacca tacaaaaagt 840
ttgaactttt ctgaatataa aataaggaaa aagtttcgat ttttataatg aattgtttac 900
tagatcgaag taggtgacaa aggttattgt gtggagaagc ataatttctg ggcttgactt 960
tgaattttgt ttctcatgca tgcaacttat caatcagctg gtgggttttg ttggaagaag 1020
cagaatctaa agctccactc tttatcaggt tcgttagggt tttatgggtt tttgaaatta 1080
aatactcaat catcttagtc tcattattct attggttgaa tcacattttc taatttggaa 1140
tttatgagac aatgtatgtt ggacttagtt gaagttcttc tctttggtta tagttgaagt 1200
gttactgatg ttgtttagct ctttacacca atatatacac ccaattttgc agaaatccga 1260
gttctgcgtt gtgattcgag taaagttgtc gcaaaaccga agtttaggaa caatcttgtt 1320
aggcctgatg gtcaaggatc ttcattgttg ttgtatccaa aacataagtc gagatttcgg 1380
gttaatgcca ctgcgggtca gcctgaggct ttcgactcga atagcaaaca gaagtctttt 1440
agagactcgt tagatgcgtt ttacaggttt tctaggcctc atacagccat cggcacagc z 1500
aagtttctct ttaaaaatgt aactctttta aaacgcaatc tttcagggtt ttcaaggaga 1560
taacattagc tctgtgattg gatttgcagg tgcttagcat tttatctgta tctttcttag 1620
cagtagagaa ggtttctgat atatctcctt tacttttcac tggcatcttg gaggtaatga 1680
atatataaca cataatgacc gatgaagaag atacattttt ttcgtctctc tgtttaaaca 1740
attgggtttt gttttcaggc tgttgttgca gctctcatga tgaacattta catagttggg 1800
ctaaatcagt tgtctgatgt tgaaatagat aaggtaacat gcaaattttc ttcatatgag 1860
ttcgagagac tgatgagatt aatagcagct agtgcctaga tcatctctat gtgggttttt 1920
gcaggttaac aagccctatc ttccattggc atcaggagaa tattctgtta acaccggcat 1980
tgcaatagta gcttccttct ccatcatggt atggtgccat tttcacaaaa tttcaacttt 2040
tagaattcta taagttactg aaatagtttg ttataaatcg ttatagagtt tctggcttgg 2100
gtggattgtt ggttcatggc cattgttctg ggctcttttt gtgagtttca tgctcggtac 2160
tgcatactct atcaatgtaa gtaagtttct caatactaga atttggctca aatcaaaatc 2220
tgcagtttct agttttaggt taatgaggtt ttaataactt acttctacta caaacagttg 2280
ccacttttac ggtggaaaag atttgcattg gttgcagcaa tgtgtatcct cgctgtccga 2340
gctattattg ttcaaatcgc cttttatcta catattcagg tactaaacca ttttccttat 2400
gttttgtagt tgttttcatc aaaatcactt ttatattact aaagctgtga aactttgttg 2460
cagacacatg tgtttggaag accaatcttg ttcactaggc ctcttatttt cgccactgcg 2520
tttatgagct ttttctctgt cgttattgca ttgtttaagg taaacaaaga tggaaaaaga 2580
ttaaatctat gtatacttaa agtaaagcat tctactgtta ttgatgagaa gfetttctttt 2640
ttggttggat gcaggatata cctgatatcg aaggggataa gatattcgga atccgatcat 2700
tctctgtaac tctgggtcag aaacgggtac gatatctaaa ctaaagaaat tgttttgact 2760
caagtgttgg attaagatta cagaagaaag aaaactgttt ttgtttcttg caaaattcag 2820
gtgttttgga catgtgttac actacttcaa atggcttacg ctgttgcaat tctagttgga 2880
gccacatctc cattcatatg gagcaaagtc atctcggtaa caatctttct ttacccatcg 2940
aaaactcgct aattcatcgt ttgagtggta ctggtttcat tttgttccgt tctgttgatt 3000
ttttttcagg ttgtgggtca tgttatactc gcaacaactt tgtgggctcg agctaagtcc 3060
gttgatctga gtagcaaaac cgaaataact tcatgttata tgttcatatg gaaggttaga 3120
ttcgtttata aatagagtct ttactgcctt tttatgcgct ccaatttgga attaaaatag 3180
cctttcagtt tcatcgaatc accattatac tgataaactc ccatccccgc accagctctt 3240
ttatgcagag tacttgctgt tacctttttt gaagtgactg acattagaag agaagaagat 3300
ggagataaaa gaataagtca tcactatgct cccgttttta ctacaagccc acgaaatcag 3360
gtagtgaact agtgaattag agttttattc tgaaacatgg cagactgcaa aaatatgtca 3420
aagatatgaa tttctgttgg gtaaagaagt ctctgcttgg gcaaaatctt aaggttcggt 3480
gtgttgatat aatgctaagc gaagaaatcg attctatgta gaaatttccg aaactatgtg 3540
taaacatgtc agaacatctc cattctatat cttcttctgc aagaaagctc tgtttttatc 3600
acctaaactc tttatctctg tgtagttaag atatgtatat gtacgtgact acattttttt 3660
gttgatgtaa tttgcagaac gtatggattt ttgttagaaa gcatgagttc gaaagtatat 3720
gtttatatat atggataatt cagacctaac gtcgaagctc acaagcataa attcactact 3780
atagtttgct ctgtaataga tagttccatt gatgtcttga aactgtacgt aactgcctgg 3840
gcgttttgtg gttgatactg actactgagt gttctttgtg agtgttgtaa gtatacaaga 3900
agaagaatat aggctcacgg gaacgactgt ggtggaagat gaaatggaga tcatcacgta 3960
gcggctttgc caaagaccga gtcacgatcg agtctatgaa gtctttacag ctgctgatta 4020
tgattgacca ttgcttagag acgcattgga atcttactag ggacttgcct gggagtttct 4080
tcaagtacgt gtcagatcat acgatgtagg agatttcacg gctttgatgt gtttgtttgg 4140
agtcacaatg cttaatgggc ttattggccc aataatagct agctcttttg ctttagccgt 4200
ttcgtttgtc ccctggtggt gagtattatt agggtatggt gtgaccaaag tcaccagacc 4260
tagagtgaat ctagtagagt cctagaccat ggtccatggc ttttatttgt aatttgaaaa 4320
atgaacaatt ctttttgtaa ggaaaacttt tatatagtag acgtttacta tatagaaact 4380
agttgaacta acttcgtgca attgcataat aatggtgtga aatagagggt gcaaaactca 4440
ataaacattt cgacgtacca agagttcgaa acaataagca aaatagattt ttttgcttca 4500
gactaatttg tacaatgaat ggttaataaa ccattgaagc ttttattaat 4550

<210> 92
<211> 4450
<212> DNA
<213> Arabidopsis sp
<400> 92
tttaggttac aaaatcaatg atattgcgta tgtcaactat aaaagccaaa agtaaagcct 60
cttgtttgac cagaaggtca tgatcattgt atacatacag ccaaactacc tcctggaaga 120
aaagacatgg atcccaaaca acaacaatag cttcttttac aagaaccagt agtaactagt 180
cactaatcta aaagagttaa gtttcagctt ttctggcaat ggctccttga tcatttcaat 240
cctgaaggag acccactttg tagcaagacc atgtcctctg tttcacttac agtgtgtctc 300
aaaagtctac ttcaattctt catatatagg ttcctcacac tacagcttca tcctcattcg 360
ttgacagaga gagagtcttt attgaaaact tcttccaagt acaactccac taaatataat 420
agcaccaaac cacttgttcg acacaaatct gtacagatat aaaaacacta ttaggttttc 480
caaggcaaat cacataattg gattgtgaaa gagtacaaaa gataaaccca aattttcata 540
ctttctactg cagtcagcac cagatgataa gtcagctgtc cctatttgcc atcctaactg 600
tcctgatgca gcggccagtg atgcgtaata ttgccaccct taatcattag agcgagaaac 660
aaaaagaatc aaaagacagt aaatggaatt aggaatcaca aatgagtcct tgtaaagttt 720
attgagtacc gagatctgca ctgaatccag aaagtgcaag aaaacctatg gatgctgtgc 780
caaatccagt taaccaaagc tttgtattat caccgaatct aagggctgtt gacttaacac 840
caacttttac atcatcttct ttgtcctgga gacacaatat attagacatt agtccatgga 900
aaaaaaatga tttaacctag aatatctcaa aattacttgc ataaaaactg aacttgagct 960
gaaattttgg gttcgtagct tgtggcatat actatttcat tttcaatggg ccacaaaggt 1020
aactttcttt tctcacttct gttgcaaacg ggaagacttt tatggggcta actcttcact 1080
taaagtatag aaatcagatg gaaaaggtgg gagatcaggg taattttctt ctttatgatt 1140
gacaaaagtc gaacatcgaa atggatgcat ttgcatgaga catgaaacaa aagctgaaaa 1200
agaaatctgt ggtggtgaag ctagaaaaag aaaacaaagc aagcaatatg cacacattga 1260
gattaactac tttgctactg gtcataatca aatagatttt gaagctaaaa aataaaaagt 1320
gaatatacct gatgtgcata aatagtatca taaacaaggg tccagcagac tccggagaga 1380
tagagaggga gtacaataga tggtgctatg cttcctttaa ctgcagtcca tcctaacaat 1440
gctccccagt ttatggtcaa acctaaaaag gcttgaggct gcaattataa aaacgaatca 1500
atcataagaa aatcagaaaa tatataatgt ctaactttga gaagccagaa tagatttaaa 1560
ttacccaaaa tgtaaacctc ttcataagtg ggtaggaaaa gacaagtaac aaagatgaag 1620
cccctaaaac acggctgcag aatatacata ctgaaatgag ctcaagtaga aaagaatttg 1680
atcacaaaac taaagacaag acctgagaac atatcttcag aatttgggcc aactacataa 1740
gggtgaacca tatgtgtatg tgaattttta aacaaacact tgcaaatacg cgactttagg 1800
gcaagtaaaa aatccaaaca aacctgtaat tgttaagttg gagaagaatc cctaagccta 1860
aaagcaactg cagcccgaga aatccaatcc cttgaaatgg tgtcaaaaga ccactggcga 1920
taggtcttag ttttgtacga tcaacctgga tataaaagaa atttgtaaga caacataatc 1980
taaaacaaaa caaccataca aaatcttgag ctttacatac aagcaaccca tctttgttta 2040
tggaagaatg aatccagtta catgaatgct gtgtatctac cctaactact aaacacatat 2100
ttcaatcgaa aaacatattc caccttcacc atatctaaca cctgaagtct ttcacttttt 2160
gaacgaagtc atcagaacat gcagataagc tattacccaa aacagagata tgactggaaa 2220
tgttgtcgta aattgatcca acatagaaaa atcaagacca gttccagatg tcaaagcaat 2280
aacactttcc caccatggtt acagaaacca tagttacaca aaacatgttt cctaaaccaa 2340
catactaaag ggatatataa atttgacatc actttatcac cataccataa gatagcttaa 2400
aaacaaactg acctttgtat ctatgtcctg atcaagcaga tcatttatag tacaaccagc 2460
acctctaaga agtaatgctc cgcaaccaaa taaagccata tatttaaaac ttggaaggct 2520
tccaggatca gcagccaacg caatcgacct atacaacaat gatggagatt cagagtatcg 2580
atctatttac atagctctgg aactagatcc atgacgaaac atggaacatc gttataatat 2640
ctaaagactt ccaaacagat tcctgagtaa gaaacccagt ggaactatag tactgtaaca 2700
tatataaaat caaagaaaac tcaggtttat agcattatcc aatcctgatt tctgccaatc 2760
cttaaccact ctcccatgct atcaaaaacc tcagctcaag atcatactac ctaattgcct 2820
atgagctctt gggaagatca ttatggattt gataactgaa aaaagtaaca gagaaatagc 2880
agactgcaag aactactcca aacttctcca ctgatatgta tgtagtctaa caataataaa 2940
cagacataaa ttcttttatc aagcttcaag agcaagttag tcagaaaaca tcacagccaa 3000
accaaccagg aaaacacata actttatcac ataaaactaa atttaatgta atctgactta 3060
acataaacca tcctttggga cgaaaggaaa ctatataaac atgcagtctt tctttccctc 3120
agctattctt tcggatggat tataatgaat ctcaaaagtg aaatgtcttg attctcagct 3180
acattactca aaggcgaaga taaacttacc acatacaagg ccacgcaagc aaccaagttc 3240
caatgggttt atccaatcga gcaagcttag cataacctct aacttcttct ggtaaataca 3300
aatctatcca agaagcttcc ttaacaacaa caccatcact cttctcctta tcatctttct 3360
tcggctttcc ctccaaaacc gaagaagacg acgacattcc acaaattaat ctgtaattcc 3420
aaccaacacc aaaaaacttc tcctgatgca attctcttcc tttactccat acttggtaat 3480
tatcattcca tgaaggataa cacttagtga aaggatttgt gtaatgggta gtcacaggat 3540
tggacaagga tttatgttgt gattgcaaaa gagcagagga agaagatgga gttacggaga 3600
cggaagattt caacaaccgt cttgaaacac gggagagccc aaaaaacgcc atctttgaga 3660
gaaattgttg cctggaagaa acaaagactt gagatttcaa acgtaagtga attcttacga 3720
acgaaagcta acttctcaag agaatcagat tagtgattcc tcaaaaacaa acaaaactat 3780
ctaatttcag tttcgagtga tgaagcctta agaatctaga acctccatgg cgtttctaat 3840
ctctcagaga taatcgaatt ccttaaacaa tcaaagctta gaaagagaag aacaacaaca 3900
acaacaaaaa aaatcagatt aacaaccgac cagagagcaa cgacgacgcc ggcgagaaag 3960
agcacgtcgt ctcggagcaa gacttcttct ccagtaaccc ggatggatcg ttaatgggcc 4020
tgtagattat tatatttggg ccgaaacaat tgggtcagca aaaacttggg ggataatgaa 4080
gaaacacgta cagtatgcat ttaggctcca aattaattgg ccatataatt cgaatcagat 4140
aaactaatca acccctacct tacttatttc tcactgtttt tatttctacc ttagtagttg 4200
aagaaacact tttatttatc ttttcgggac ccaaatttga taggatcggg ccattactca 4260
tgagcgtcag acacatatta gccttatcag attagtgggg taaggttttt ttaattcggt 4320
aagaagcaac aatcaatgtc ggagaaatta aagaatctgc atgggcgtgg cgtgatgata 4380
tgtgcatatg gagtcagttg ccgatcatat ataactattt ataaactaca tataaagact 4440
actaatagat 4450

<210> 93
<211> 2850
<212> DNA
<213> Arabidopsis sp
<400> 93
aattaaaatt tgagcggtct aaaccattag accgtttaga gatccctcca acccaaaata 60
gtcgattttc acgtcttgaa catatattgg gccttaatct gtgtggttag taaagacttt 120
tattggtcaa agaaaaacaa ccatggccca acatgttgat acttttattt aattatacaa 180
gtacccctga attctctgaa atatatttga ttgacccaga tattaatttt aattatcatt 240
tcctgtaaaa gtgaaggagt caccgtgact cgtcgtaatc tgaaaccaat ctgttcatat 300
gatgaagaag tttctctcgt tctcctccaa cgcgtagaaa attctgacgg cttaacgatg 360
tggcgaagat ctgttgttta tcgtttctct tcaagaatct ctgtttcttc ttcgttacca 420
aaccctagac tgattccttg gtcccgcgaa ttatgtgccg ttaatagctt ctcccagcct 480
ccggtctcga cggaatcaac tgctaagtta gggatcactg gtgttagatc tgatgccaat 540
cgagtttttg ccactgctac tgccgccgct acagctacag ctaccaccgg tgagatttcg 600
tctagagttg cggctttggc tggattaggg catcactacg ctcgttgtta ttgggagctt 660
tctaaagcta aacttaggta tgtgtttact tttcttttct catgaaaaat ctgaaaattt 720
ccaattgttg gattcttaaa ttctcatttg ttttatggtt gtagtatgct tgtggttgca 780
acttctggaa ctgggtatat tctgggtacg ggaaatgctg caattagctt cccggggctt 840
tgttacacat gtgcaggaac catgatgatt gctgcatctg ctaattcctt gaatcaggtc 900
attgaaatgt tgagaagttc ataaatttcg aatccttgtt gtgtttatgt agttgatctt 960

gcttgcttat gtttatgtag ttgaaaagtt taaaaatttc taatccttgg tagttgatct 1020 

cgcttgtttg ttttttcatt ttctagattt ttgagataag caatgattct aagatgaaaa 1080 

gaacgatgct aaggccattg ccttcaggac gtattagtgt tccacacgct gttgcatggg 1140 

ctactattgc tggtgcttct ggtgcttgtt tgttggccag caaggtgaat gtttgttttt 1200

ttatatgtga tttctttgtt ttatgaatgg gtgattgaga gattatggat ctaaactttt 1260 

gcttccacga caaggttatt gcagactaat atgttggctg ctggacttgc atctgccaat 1320 

cttgtacttt atgcgtttgt ttatactccg ttgaagcaac ttcaccctat caatacatgg 1380 

gttggcgctg ttgttggtgc tatcccaccc ttgcttgggt aaatttttgt tccttttctt 1440 

ctttatttta gcagattctg ttttgttgga tactgctttt aattcaaaat gtagtcatgg 1500

ttcaccaatt ctatgcttat ctattttgtg tgttgtcagg tgggcggcag cgtctggtca 1560 

gatttcatac aattcgatga ttcttccagc tgctctttac ttttggcaga tacctcattt 1620 

tatggccctt gcacatctct gccgcaatga ttatgcagct ggagggtaag accatatggt 1680 

gtcatatgag attagaatgt ctccttccat gtagtgttga tcttgaacta gttcaatttc 1740 

gtggaatgat cagagtgtcc tagatagtgt cacagcagtc gacattttag tggctagata 1800

atgagttctt tccgttagag ataaacattc gcgaacattg tttccagctt ccgcgaccca 1860 

acttctgatt ttgtttcttg gtaccttgtt ttcagttaca agatgttgtc actctttgat 1920 

ccgtcaggga agagaatagc agcagtggct ctaaggaact gcttttacat gatccctctc 1980 

ggtttcatcg cctatgactg tgagtcttgt agattcatct tttttttgta gtttattgac 2040

tgcattgctg tatctgattt ttgctgttcc ttccaatttt tgtgacaggg gggttaacct 2100

caagttggtt ttgcctcgaa tcaacacttc tcacactagc aatcgctgca acagcatttt 2160 

cattctaccg agaccggacc atgcataaag caaggaaaat gttccatgcc agtcttctct 2220 

tccttcctgt tttcatgtct ggtcttcttc tacaccgtgt ctctaatgat aatcagcaac 2280 

aactcgtaga agaagccgga ttaacaaatt ctgtatctgg tgaagtcaaa actcagaggc 2340 

gaaagaaacg tgtggctcaa cctccggtgg cttatgcctc tgctgcaccg tttcctttcc 2400

tcccagctcc ttccttctac tctccatgat aacctttaag caagctattg aatttttgga 2460 

aacagaaatt aaaaaaaaaa tctgaaaagt tcttaagttt aatctttggt taataatgaa 2520 

gtggagaacg catacaagtt tatgtatttt ttctcatctc cacataattg tattttttct 2580 

ctaagtatgt ttcaaatgat acaaaataca tactttatca attatctgat caaattgatg 2640 

aatttttgag ctttgacgtg ttaggtctat ctaataaacg tagtaacgaa tttggttttg 2700

gaaatgaaat ccgataaccg atgatggtgt agagttaaac gattaaaccg ggttggttaa 2760 
aggtctcgag tctcgacggc tgcggaaatc ggaaaatcac gattgaggac tttgagctgc 2820

cacgaagatg gcgatgaggt tgaaatcaat 2850 

<210> 94

<211> 3660 
<212> DNA 

<213> Arabidopsis sp 
<400> 94

tatttgtatt tttattgtta aattttatga tttcacccgg tatatatcat cccatattaa 60 

tattagattt attttttggg ctttatttgg gttttcgatt taaactgggc ccattctgct 120 

tcaatgaaac cctaatgggt tttgtttggg ctttggattt aaaccgggcc cattctgctt 180 

caatgaaggt cctttgtcca acaaaactaa catccgacac aactagtatt gccaagagga 240

tcgtgccaca tggcagttat tgaatcaaag gccgccaaaa ctgtaacgta gacattactt 300 

atctccggta acggacaacc actcgtttcc cgaaacagca actcacagac tcacaccact 360 

ccagtctccg gcttaactac caccagagac gattctctct tccgtcggtt ctatgacttc 420 

gattctcaac actgtctcca ccatccactc ttccagagtt acctccgtcg atcgagtcgg 480 

agtcctctct cttcggaatt cggattccgt tgagttcact cgccggcgtt ctggtttctc 540

gacgttgatc tacgaatcac ccggtagtta gcattctgtt ggatagattg atgaatgttt 600 

tcttcgattt tttttttact gatcttgttg tggatctctc gtagggcgga gatttgttgt 660 

gcgtgcggcg gagactgata ctgataaagg tatgattttt tagttgtttt tattttctct 720 

ctcttcaaaa ttctcttttc aaacactgtg gcgtttgaat ttccgacggc agttaaatct 780 

cagacacctg acaaggcacc agccggtggt tcaagcatta accagcttct cggtatcaaa 840

ggagcatctc aagaaactgt aattttgttc atctcctcag aatcttttaa attatcatat 900 

ttgtggataa tgatgtgtta gtttaggaat tttcctacta aaggtaatct cttttgagga 960 

caagtcttgt ttttagctta gaaatgatgt gaaaatgttg tttgttagct aaaaagagtt 1020 

tgttgttata ttctgtattc agaataaatg gaagattcgt cttcagctta caaaaccagt 1080 

cacttggcct ccactggttt ggggagtcgt ctgtggtgct gctgcttcag gtaatcatac 1140

gaacctcttt tggatcatgc aatactgtac agaaagtttt ttcattttcc ttccaattgt 1200 

ttcttctggc agggaacttt cattggaccc cagaggatgt tgctaagtcg attctttgca 1260 

tgatgatgtc tggtccttgt cttactggct atacacaggt ctggttttac acaacaaaaa 1320 

gctgacttgt tcttattcta gtgcatttgc ttggtgctac aataacctag acttgtcgat 1380 

ttccagacaa tcaacgactg gtatgataga gatatcgacg caattaatga gccatatcgt 1440

ccaattccat ctggagcaat atcagagcca gaggtaactg agacagaaca ttgtgagctt 1500 

ttatctcttt tgtgattctg atttctcctt actccttaaa atgcaggtta ttacacaagt 1560 

ctgggtgcta ttattgggag gtcttggtat tgctggaata ttagatgtgt gggtaagttg 1620 

gcccttctga cattaactag tacagttaaa gggcacatca gatttgctaa aatcttccct 1680 

tatcaggcag ggcataccac tcccactgtc ttctatcttg ctttgggagg atcattgcta 1740

tcttatatat actctgctcc acctcttaag gtaagtttta ttcctaactt ccactctcta 1800

gtgataagac actccatcca agttttggag ttttgaatat cgatatctga actgatctca 1860 

ttgcagctaa aacaaaatgg atgggttgga aattttgcac ttggagcaag ctatattagt 1920 

ttgccatggt aagatatctc gtgtatcaat aatatatggc gttgttctca tctcattgat 1980 

ttgtttcttg ctcacttgac tgataggtgg gctggccaag cattgtttgg cactcttacg 2040

ccagatgttg ttgttctaac actcttgtac agcatagctg gggtactctt ttggcaaacc 2100 

ttttatgttg cttttttcgt tatctgttgt aatatgctct tgcttcatgt tgtacctttg 2160 

tgataatgca gttaggaata gccattgtta acgacttcaa aagtgttgaa ggagatagag 2220 

cattaggact tcagtctctc ccagtagctt ttggcaccga aactgcaaaa tggatatgcg 2280 

ttggtgctat agacattact cagctttctg ttgccggtat gtactatcca ctgtttttgt 2340
gcagctgtgg cttctatttc ttttccttga tcttatcaac tggatattca ccaatggtaa 2400
agcacaaatt aatgaagctg aatcaacaaa ggcaaaacat aaaagtacat tctaatgaaa 2460
tgagctaatg aagaggaggc atctactttt atgtttcatt agtgtgattg atggattttc 2520
atttcatgct tctaaaacaa gtattttcaa cagtgtcatg aaataacaga acttatatct 2580
tcatttgtac ttttactagt ggatgagtta cacaatcatt gttatagaac caaatcaaag 2640
gtagagatca tcattagtat atgtctattt tggttgcagg atatctatta gcatctggga 2700
aaccttatta tgcgttggcg ttggttgctt tgatcattcc tcagattgtg ttccaggtaa 2760
agacgttaac agtctcacat tataattaat caaattcttg tcactcgtct gattgctaca 2820
ctcgcttcta taaactgcag tttaaatact ttctcaagga ccctgtcaaa tacgacgtca 2880
agtaccaggt aagtcaactt agtacacatg tttgtgttct tttgaaatat ctttgagagg 2940
tctcttaatc agaagttgct tgaaacactc atcttgatta caggcaagcg cgcagccatt 3000
cttggtgctc ggaatatttg taacggcatt agcatcgcaa cactgaaaaa ggcgtatttt 3060
gatggggttt tgtcgaaagc agaggtgttg acacatcaaa tgtgggcaag tgatggcatc 3120
aactagttta aaagattttg taaaatgtat gtaccgttat tactagaaac aactcctgtt 3180
gtatcaattt agcaaaacgg ctgagaaatt gtaattgatg ttaccgtatt tgcgctccat 3240
ttttgcattt cctgctcata tcgaggattg gggtttatgt tagttctgtc acttctctgc 3300
tttcagaatg tttttgtttt ctgtagtgga ttttaactat tttcatcact ttttgtattg 3360
attctaaaca tgtatccaca taaaaacagt aatatacaaa aatgatactt cctcaaactt 3420
tttataatct aaatctaaca actagctagt aacccaacta acttcataca attaatttga 3480
gaaactacaa agactagact atacatatgt tatttaacaa cttgaaactg tgttattact 3540
acctgatttt tttctattct acagccattt gatatgctgc aatcttaaca tatcaagtct 3600
cacgttgttg gacacaacat actatcacaa gtaagacacg aagtaaaacc aaccggcaac 3660

<210> 95
<211> 1236
<212> DNA
<213> SOY
<400> 95
atggattcac tgcttcttcg atctttccct aatattaata acgcctcttc tctcaccacc 60
actggtgcaa atttctccag gactaaatct ttcgccaaca tttaccatgc aagttcttat 120
gtgccaaatg cttcatggca caataggaaa atccaaaaag aatataattt tttgaggttt 180
cggtggccaa gtttgaacca tcattacaaa ggcattgagg gagcgtgtac atgtaaaaaa 240
tgtaatataa aatttgttgt gaaagcgacc tctgaaaaat ctcttgagtc tgaacctcaa 300
gcttttgatc caaaaagcat tttggactct gtcaagaatt ccttggatgc tttctacagg 360
ttttccaggc ctcacacagt tattggcaca gcattaagca taatttctgt gtctcttctt 420
gctgttgaga aaatatcaga tatatctcca ttatttttta ctggtgtgtt ggaggctgtg 480
gttgctgccc tgtttatgaa tatttatatt gttggtttga atcaattgtc tgatgttgaa 540
atagacaaga taaacaagcc gtatcttcca ttagcatctg gggaatattc ctttgaaact 600
ggtgtcacta ttgttgcatc tttttcaatt ctgagttttt ggcttggctg ggttgtaggt 660
tcatggccat tattttgggc cctttttgta agctttgtgc taggaactgc ttattcaatc 720
aatgtgcctc tgttgagatg gaagaggttt gcagtgcttg cagcgatgtg cattctagct 780
gttcgggcag taatagttca acttgcattt ttccttcaca tgcagactca tgtgtacaag 840
aggccacctg tcttttcaag accattgatt tttgctactg cattcatgag cttcttctct 900
gtagttatag cactgtttaa ggatatacct gacattgaag gagataaagt atttggcatc 960
caatcttttt cagtgcgttt aggtcagaag ccggtgttct ggacttgtgt tacccttctt 1020
gaaatagctt atggagtcgc cctcctggtg ggagctgcat ctccttgtct ttggagcaaa 1080
attttcacgg gtctgggaca cgctgtgctg gcttcaattc tctggtttca tgccaaatct 1140
gtagatttga aaagcaaagc ttcgataaca tccttctata tgtttatttg gaagctattt 1200
tatgcagaat acttactcat tccttttgtt agatga 1236

<210> 96
<211> 1188
<212> DNA
<213> SOY
<400> 96
atggattcga tgcttcttcg atcttttcct aatattaaca acgcttcttc tctcgccacc 60
actggttctt atttgccaaa tgcttcatgg cacaatagga aaatccaaaa agaatataat 120
tttttgaggt ttcggtggcc aagtttgaac caccattaca aaagcattga aggagggtgt 180
acatgtaaaa aatgtaatat aaaatttgtt gtgaaagcga cctctgaaaa atcttttgag 240
tctgaacccc aagcttttga tccaaaaagc attttggact ctgtcaagaa ttccttggat 300
gctttctaca ggttttccag acctcacaca gttattggca cagcattaag cataatttct 360
gtgtccctcc ttgctgttga gaaaatatca gatatatctc cattattttt tactggtgtg 420
ttggaggctg tggttgctgc cctgtttatg aatatttata ttgttggttt gaatcaattg 480
tctgatgttg aaatagacaa gataaacaag ccgtatcttc cattagcatc tggggaatat 540
tcctttgaaa ctggtgtcac tattgttgca tctttttcaa ttctgagttt ttggcttggc 600
tgggttgtag gttcatggcc attattttgg gccctttttg taagctttgt gctaggaact 660
gcttattcaa tcaatgtgcc tctgttgaga tggaagaggt ttgcagtgct tgcagcgatg 720
tgcattctag ctgttcgggc agtaatagtt caacttgcat ttttccttca catccagact 780
catgtataca agaggccacc tgtcttttca agatcattga tttttgctac tgcattcatg 840
agcttcttct ctgtagttat agcactgttt aaggatatac ctgacattga aggagataaa 900
gtatttggca tccaatcttt ttcagtgcgt ttaggtcaga agccggtatt ctggacttgt 960
gttatccttc ttgaaatagc ttatggagtc gccctcctgg tgggagctgc atctccttgt 1020
ctttggagca aaattgtcac gggtctggga cacgctgttc tggcttcaat tctctggttt 1080
catgccaaat ctgtagattt gaaaagcaaa gcttcgataa catccttcta tatgtttatt 1140
tggaagctat tttatgcaga atacttactc attccttttg ttagatga 1188



<210> 97
<211> 395
<212> PRT
<213> SOY
<400> 97

Met Asp Ser Met Leu Leu Arg Ser Phe Pro Asn Ile Asn Asn Ala Ser
1 5 10 15

Ser Leu Ala Thr Thr Gly Ser Tyr Leu Pro Asn Ala Ser Trp His Asn
20 25 30

Arg Lys Ile Gln Lys Glu Tyr Asn Phe Leu Arg Phe Arg Trp Pro Ser
35 40 45

Leu Asn His His Tyr Lys Ser Ile Glu Gly Gly Cys Thr Cys Lys Lys
50 55 60

Cys Asn Ile Lys Phe Val Val Lys Ala Thr Ser Glu Lys Ser Phe Glu
65 70 75 80

Ser Glu Pro Gln Ala Phe Asp Pro Lys Ser Ile Leu Asp Ser Val Lys
85 90 95

Asn Ser Leu Asp Ala Phe Tyr Arg Phe Ser Arg Pro His Thr Val Ile
100 105 110

Gly Thr Ala Leu Ser Ile Ile Ser Val Ser Leu Leu Ala Val Glu Lys
115 120 125

Ile Ser Asp Ile Ser Pro Leu Phe Phe Thr Gly Val Leu Glu Ala Val
130 135 140

Val Ala Ala Leu Phe Met Asn Ile Tyr Ile Val Gly Leu Asn Gln Leu
145 150 155 160

Ser Asp Val Glu Ile Asp Lys Ile Asn Lys Pro Tyr Leu Pro Leu Ala
165 170 175

Ser Gly Glu Tyr Ser Phe Glu Thr Gly Val Thr Ile Val Ala Ser Phe
180 185 190

Ser Ile Leu Ser Phe Trp Leu Gly Trp Val Val Gly Ser Trp Pro Leu
195 200 205

Phe Trp Ala Leu Phe Val Ser Phe Val Leu Gly Thr Ala Tyr Ser Ile
210 215 220

Asn Val Pro Leu Leu Arg Trp Lys Arg Phe Ala Val Leu Ala Ala Met
225 230 235 240

Cys Ile Leu Ala Val Arg Ala Val Ile Val Gln Leu Ala Phe Phe Leu
245 250 255

His Ile Gln Thr His Val Tyr Lys Arg Pro Pro Val Phe Ser Arg Ser
260 265 270

Leu Ile Phe Ala Thr Ala Phe Met Ser Phe Phe Ser Val Val Ile Ala
275 280 285

Leu Phe Lys Asp Ile Pro Asp Ile Glu Gly Asp Lys Val Phe Gly Ile
290 295 300

Gln Ser Phe Ser Val Arg Leu Gly Gln Lys Pro Val Phe Trp Thr Cys
305 310 315 320

Val Ile Leu Leu Glu Ile Ala Tyr Gly Val Ala Leu Leu Val Gly Ala
325 330 335

Ala Ser Pro Cys Leu Trp Ser Lys Ile Val Thr Gly Leu Gly His Ala
340 345 350

Val Leu Ala Ser Ile Leu Trp Phe His Ala Lys Ser Val Asp Leu Lys
355 360 365

Ser Lys Ala Ser Ile Thr Ser Phe Tyr Met Phe Ile Trp Lys Leu Phe
370 375 380

Tyr Ala Glu Tyr Leu Leu Ile Pro Phe Val Arg
385 390 395

<210> 98
<211> 411
<212> PRT
<213> SOY
<400> 98

Met Asp Ser Leu Leu Leu Arg Ser Phe Pro Asn Ile Asn Asn Ala Ser
1 5 10 15

Ser Leu Thr Thr Thr Gly Ala Asn Phe Ser Arg Thr Lys Ser Phe Ala
20 25 30

Asn Ile Tyr His Ala Ser Ser Tyr Val Pro Asn Ala Ser Trp His Asn
35 40 45

Arg Lys Ile Gln Lys Glu Tyr Asn Phe Leu Arg Phe Arg Trp Pro Ser
50 55 60

Leu Asn His His Tyr Lys Gly Ile Glu Gly Ala Cys Thr Cys Lys Lys
65 70 75 80

Cys Asn Ile Lys Phe Val Val Lys Ala Thr Ser Glu Lys Ser Leu Glu
85 90 95

Ser Glu Pro Gln Ala Phe Asp Pro Lys Ser Ile Leu Asp Ser Val Lys
100 105 110

Asn Ser Leu Asp Ala Phe Tyr Arg Phe Ser Arg Pro His Thr Val Ile
115 120 125

Gly Thr Ala Leu Ser Ile Ile Ser Val Ser Leu Leu Ala Val Glu Lys
130 135 140

Ile Ser Asp Ile Ser Pro Leu Phe Phe Thr Gly Val Leu Glu Ala Val
145 150 155 160

Val Ala Ala Leu Phe Met Asn Ile Tyr Ile Val Gly Leu Asn Gln Leu
165 170 175

Ser Asp Val Glu Ile Asp Lys Ile Asn Lys Pro Tyr Leu Pro Leu Ala
180 185 190

Ser Gly Glu Tyr Ser Phe Glu Thr Gly Val Thr Ile Val Ala Ser Phe
195 200 205

Ser Ile Leu Ser Phe Trp Leu Gly Trp Val Val Gly Ser Trp Pro Leu
210 215 220

Phe Trp Ala Leu Phe Val Ser Phe Val Leu Gly Thr Ala Tyr Ser Ile
225 230 235 240

Asn Val Pro Leu Leu Arg Trp Lys Arg Phe Ala Val Leu Ala Ala Met
245 250 255

Cys Ile Leu Ala Val Arg Ala Val Ile Val Gln Leu Ala Phe Phe Leu
260 265 270

His Met Gln Thr His Val Tyr Lys Arg Pro Pro Val Phe Ser Arg Pro
275 280 285

Leu Ile Phe Ala Thr Ala Phe Met Ser Phe Phe Ser Val Val Ile Ala
290 295 300

Leu Phe Lys Asp Ile Pro Asp Ile Glu Gly Asp Lys Val Phe Gly Ile
305 310 315 320

Gln Ser Phe Ser Val Arg Leu Gly Gln Lys Pro Val Phe Trp Thr Cys
325 330 335

Val Thr Leu Leu Glu Ile Ala Tyr Gly Val Ala Leu Leu Val Gly Ala
340 345 350

Ala Ser Pro Cys Leu Trp Ser Lys Ile Phe Thr Gly Leu Gly His Ala
355 360 365

Val Leu Ala Ser Ile Leu Trp Phe His Ala Lys Ser Val Asp Leu Lys
370 375 380

Ser Lys Ala Ser Ile Thr Ser Phe Tyr Met Phe Ile Trp Lys Leu Phe
385 390 395 400

Tyr Ala Glu Tyr Leu Leu Ile Pro Phe Val Arg
405 410



<210> 99
<211> 964
<212> DNA
<213> RICE
<400> 99
gagcagcact gggtcttaca ttccaatgga gctcgcctgt tgctttcatt acatgcttcg 60
tgactttatt tgctttggtc attgctataa ccaaagatct cccagatgtt gaaggggatc 120
ggaagtatca aatatcaact ttggcgacaa agctcggtgt cagaaacatt gcatttcttg 180
gctctggttt attgatagca aattatgttg ctgctattgc tgtagctttt ctcatgcctc 240
aggctttcag gcgcactgta atggtgcctg tgcatgctgc ccttgccgtt ggtataattt 300
tccagacatg ggttctggag caagcaaaat atactaagga tgctatttca cagtactacc 360
ggttcatttg gaatctcttc tatgctgaat acatcttctt cccgttgata tagagaccaa 420
gcaatctgat atggtctgca tgttgagtgc ggcaaaaact agaagcccat atgaacagtg 480
ggagtagggg aacgaacatg ccatccatgg gaagactctg ataactctct ctcgcccggg 540
ctgtaaaggg taagcactgt tgggcatata tatgaaagga aggtgataaa gcagggatgc 600
taaattgcta ctgggatcct caaaggctta tagtggtcac cagtggaatg tgccttaata 660
atttggttac ccagcagagc aagtttttgc aggttattag gtaatatctt tgagggaatg 720
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aacttagatt tcattgtttt aaggtctggt 
aaaaacgacc ttgttttaca ctaccaaggg 

5 

accttgagag ttgagaccat ggaatcactt 
tcagcatttg cattctcctc cccacttgta 
10 gtgt 

<210> 100 

<211> 421 

15 <212> DNA 

<213> WHEAT 

<400> 100 

cgtccgcgga cgcgtgggtg cttattcagt 

20 

tgctgttgtt gcagcactct gcatattagc 
ttttctccac attcagacat ttgttttcag 
25 atttgcaact gccttcatga cattcttctc 
cgatattgaa ggggaccgca tctttggaat 
caggggtttc tggacttgcg ttggcctact 

30 

gggggtaact tcttccagtt tgtggagcaa 
c 

35 

<210> 101 
<211> 705 
<212> DNA 
<213> LEEK 

40 

<400> 101 

gtttcccccc ctcgaatttt tttttttttt 
taaaaaagac aaagaaaacc actggatatc 

45 

tgataatctt taacacaaca tacaacatga 
ttgaaagaac tctccgtttt taagatgaca 
50 gcctccatta tctactcatc ttctcttgcg 
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cacacaacgg gtagtagtgc tggagcggca 780 
aggttaactc tagttttcat gtgaccactt 840 
gtcgactcct cggcttgtat atttctagtg 900 
cttgaaaagt tgaagacaac ttttttgttt 960 

964 



caatctgccg cactttctat ggaagagatc 60 

agtgcgtgcg gtgatagttc aactggcatt 120 

aaggccggca gacttttcaa agccattgat 180 

agttgtaata gcattattca aggatatacc 240 

ccaatctttt agtggtagac taggtcaaag 300 

tgaggttgcc tacggtgttg cgatactgag 360 

atctataact gttgtgggcc atgcaatcct 420 

421 



ttttacttca tttttctgtg aataaattct 60 

ctaaattcaa cataggctat tgtcattcaa 120 

atataattaa ggagaaatga tctgcaattg 180 

attaaagcgt tgttaattcc agccatttct 240 

attcttttcc atgtaggtca taaaccctca 300 
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tcttacaaaa ggaatgagca agt act cage atagaagagc ttccacacga acatataaaa 



360 



agatgtaata gtggttttgg tcattggtcc atgagatcta gcacgattcc aaagtaacga 



420 



cccaagaatt gcatgaccta tcactgttaa geatttgetc cataggcatg aggaagtagc 



480 



tccaacaacc atgacaacag tgtaggecat ctcaaggaga tatatacata tccaaaacac 



540 



cctctcctgg ccaaggcgca cgctgaaaga atggatgcca aatattttgt ctccgtctat 



600 



atcaggtata tccttaaata gagcaataac aactgagaag aagctcatga aggcagttgc 



660 



aaatatcaat ggccttgtga aacttgetgg tcttttgaaa acaaa 



705 



<210> 102 

<211> 637 

<212> DNA 

<213> LEEK 

<220> 

<221> misc_feature

<222> (1)..(637)

<223> n = g, a, t or c 

<400> 102 

nattcggcac gagttttgaa gaagttaagc atggactccc tccttaccaa gccagttgta 60

atacctctgc cttctccagt ttgttcacta ccaatcttgc gaggcagttc tgcaccaggg 120 

cagtattcat gtagaaacta caatccaata agaattcaaa ggtgcctcgt aaattatgaa 180 

catgtgaaac caaggtttac aacatgtagt aggtctcaaa aacttggtca tgtaaaagcc 240

acatccgagc attctttaga atctggatcc gaaggataca ctcctagaag catatgggaa 300 

gccgtactag cttcactgaa tgttctatac aaattttcac gacctcacac aataatagga 360

acagcaatgg gcataatgtc agtttctttg cttgttgtcg agagcctatc cgatatttct 420

cctctgtttt ttgtgggatt attagaggct gtggttgctg cattgtttat gaatgtttac 480

attgtaggtc tgaatcaatt atttgacata gaaatagaca aggtcaataa acctgatctt 540 

cctcttgcat ctggagaata ctcaccaaga gctggtactg ctattgtcat tgcttcagcc 600 

atcatgagct ttggcattgg atggttagtt ggctctt 637

<210> 103

<211> 677

<212> DNA

<213> CANOLA

<400> 103

tttttttttt tttttttcaa aaagaccaat cctttagtat gtacatgaac aaagtgattt 60

tgtctccaag ctacaaagaa gaagaagaga ggtatacaaa gaaaactaca aatgttcacc 120

atgaatgcta gaagaagggg aataacagat actctgcgta gaagagattc catataaacc 180

ggtaatatcc tgctatagct tcctttgtgt agtttgcttt ttctagcacc catgtctgga 240

aaaccaagca tgaagccaag atcatatgtg caggaatcat caagctacct ctaaaaacct 300

gaggcatgta gaaagctagt gatatggcag aaatatagtt cactagcaga agtccagaac 360

cgaggaatgc aatgttcctc actccaagct ttgttgctag tgttgatatt tggaacttgc 420

gatctccttc aacatcagga agatcttttg taatagcaat gactagtgca aacagtgtca 480

caaaagacgt gatgaaagcc acaggtgcac tccactgaaa cgaaagtcca agagcagctc 540

tagtagcatg gtacacacca aaattaagaa gaaaacctcg taccgtggca ataataagaa 600

acgctgcaac tggaaatctc ttcattctaa atggtggaac agaatagatg gtccccagat 660

cggacgcgtg ggtcgac 677

<210> 104

<211> 1431

<212> DNA

<213> CORN

<400> 104

ccacgcgtcc gcccggccaa gggatggacg cgcttcgcct acggccgtcc ctcctccccg 60

tgcggcccgg cgcggcccgc ccgcgagatc attttctacc accatgttgt tccatacaac 120

gaaatggtga aggacgaatt tgcttttcta gccaaaggac ccaaggtcct accttgcatc 180

accatcagaa attcttcgaa tggaaatcct cctattgtag gatatcacat cggtcattaa 240

atacttctgt taatgcttcg gggcaacagc tgcagtctga acctgaaaca catgattcta 300

caaccatctg gagggcaata tcatcttctc tagatgcatt ttacagattt tcccggccac 360

atactgtcat aggaacagca ttaagcatag tctcagtttc ccttctagct gtccagagct 420

tgtctgatat atcacctttg ttcctcactg gtttgctgga ggcagtggta gctgcccttt 480

tcatgaatat ctatattgtt ggactgaacc agttattcga cattgagata gacaaggtta 540

acaagccaac tcttccattg gcatctgggg aatacaccct tgcaactggg gttgcaatag 600

tttcggtctt tgccgctatg agctttggcc ttggatgggc tgttggatca caacctctgt 660

tttgggctct tttcataagc tttgttcttg ggactgcata ttcaatcaat ctgccgtacc 720

ttcgatggaa gagatttgct gttgttgcag cactgtgcat attagcagtt cgtgcagtga 780

ttgttcagct ggcctttttt ctccacattc agacttttgt tttcaggaga ccggcagtgt 840

tttctaggcc attattattt gcaactggat ttatgacgtt cttctctgtt gtaatagcac 900

tattcaagga tatacctgac atcgaagggg accgcatatt cgggatccga tccttcagcg 960

tccggttagg gcaaaagaag gtcttttgga tctgcgttgg cttgcttgag atggcctaca 1020

gcgttgcgat actgatggga gctacctctt cctgtttgtg gagcaaaaca gcaaccatcg 1080

ctggccattc catacttgcc gcgatcctat ggagctgcgc gcgatcggtg gacttgacga 1140

gcaaagccgc aataacgtcc ttctacatgt tcatctggaa gctgttctac gcggagtacc 1200

tgctcatccc tctggtgcgg tgagcgcgag gcgaggtggt ggcagacgga tcggcgtcgg 1260

cggggcggca aacaactcca cgggagaact tgagtgccgg aagtaaactc ccgtttgaaa 1320

gttgaagcgt gcaccaccgg caccgggcag agagagacac ggtggctgga tggatacgga 1380

tggccccccc aataaattcc cccgtgcatg gtaaaaaaaa aaaaaaaaaa a 1431




<210> 105 




<211> 1870

<212> DNA

<213> CORN

<400> 105

gccgcgcagc gcgacgagcg ccacctgctt gctgccgcgt gcctgcgtgc gtgtgcgtcc 60

accactgacc ccgcgcccgc ccaccocccc tacccchcca cfcccacfcfccre traoh fort" err 120

cggcccgctt cccccccggc caagggatgg acgcgcttcg cctacggccg tccctcctcc 180

ccgtgcggcc cggcgcggcc cgcccgcgag atcattttct accaccatgt tgttccatac 240

aacgaaatgg tgaaggacga atttgctttt ctagccaaag gacccaaggt cctaccttgc 300

atcaccatca gaaattcttc gaatggaaat cctcctattg taggatatca catcggtcat 360

taaatacttc tgttaatgct tcggggcaac agctgcagtc tgaacctgaa acacatgatt 420

ctacaaccat ctggagggca atatcatctt ctctagatgc attttacaga ttttcccggc 480

cacatactgt cataggaaca gcattaagca tagtctcagt ttcccttcta gctgtccaga 540

gcttgtctga tatatcacct ttgttcctca ctggtttgct ggaggcagtg gtagctgccc 600

ttttcatgaa tatctatatt gttggactga accagttatt cgacattgag atagacaagg 660

ttaacaagcc aactcttcca ttggcatctg gggaatacac ccttgcaact ggggttgcaa 720

tagtttcggt ctttgccgct atgagctttg gccttggatg ggctgttgga tcacaacctc 780

tgttttgggc tcttttcata agctttgttc ttgggactgc atattcaatc aatctgccgt 840

accttcgatg gaagagattt gctgttgttg cagcactgtg catattagca gttcgtgcag 900

tgattgttca gctggccttt tttctccaca ttcagacttt tgttttcagg agaccggcag 960

tgttttctag gccattatta tttgcaactg gatttatgac gttcttctct gttgtaatag 1020

cactattcaa ggatatacct gacatcgaag gggaccgcat attcgggatc cgatccttca 1080

gcgtccggtt agggcaaaag aaggtctttt ggatctgcgt tggcttgctt gagatggcct 1140

acagcgttgc gatactgatg ggagctacct cttcctgttt gtggagcaaa acagcaacca 1200

tcgctggcca ttccatactt gccgcgatcc tatggagctg cgcgcgatcg gtggacttga 1260

cgagcaaagc cgcaataacg tccttctaca tgttcatctg gaagctgttc tacgcggagt 1320

acctgctcat ccctctggtg cggtgagcgc gaggcgaggt ggtggcagac ggatcggcgt 1380

cggcggggcg gcaaacaact ccacgggaga acttgagtgc cggaagtaaa ctcccgtttg 1440

aaagttgaag cgtgcaccac cggcaccggg cagagagaga cacggtggct ggatggatac 1500

ggatggcccc cccaataaat tcccccgtgc atggtacccc acgctgcttg atgatatccc 1560

atgtgtccgg gtgaccggac ctgatcgtct ctagagagat tggttgcaca acgtccaaca 1620

tagcccgtag gtattgctac cactgctagt atgatactcc ttcctagtcc ttgccagcac 1680

cagtgaccca aacttggtcg gctgagctca gcgctcagca gctttacgtg catctgcgcc 1740

ttgacttgtg cagtgggcgt cgctagcatg aatgatgtat ggtgcgtcac ggcctgacgg 1800

ttcgtcagtc tgggccgtgt tttgtgtccg aggaagatcg tctgtcagag atctggattg 1860

cctcgctgct 1870

<210> 106

<211> 642

<212> DNA

<213> CORN

<400> 106

cggccggact cttctgactt ggcaaccgcc gcgcagcgcg acgagcgcca cctgcttgct 60

gccgcgtgcc tgcgtgcgtg tgcgtccacc actgaccccg cgcccgcccg ccgcccctgc 120

ccctccactc cacttgctca ctcgtcggct cgtcgcggcc cgcttccccc ccggccaagg 180

gatggacgcg cttcgcctac ggccgtccct cctccccgtg cggcccggcg cggcccgccc 240

gcgaggcagt ggtagctgcc cttttcatga atatctatat tgttggactg aaccagttat 300

tcgacattga gatagacaag gttaacaagc caactcttcc attggcatct ggggaataca 360

cccttgcaac tggggttgca atagtttcgg tctttgccgc tatgagcttt ggccttggat 420

gggctgttgg atcacaacct ctgttttggg ctcttttcat aagctttgtt cttgggactg 480

catattcaat caatctgccg taccttcgat ggaagagatt tgctgttgtt gcagcactgt 540

gcatattagc agttcgtgca gtgattgttc agctggcctt ttttctccac attcagactt 600

ttgttttcag gagaccggca gtgttttcta ggccattatt at 642



<210> 107 

<211> 362 

<212> DNA

<213> COTTON 

<400> 107 

cccacgcgtc cgaacattgt ttgcacttgt tattgccata accaaggatc ttccagatgt 60 

agaaggagat cgcaaatttc aaatatcaac attagcaaca aagcttggag ttagaaatat 120 

tgcatttctt ggttccggac ttctactggt gaattatgtt gctgctgtgt tggctgcaat 180 

atacatgcct caggctttca ggcgtagttt aatgatacct gctcatatct ttttggcggt 240

ctgcttgatt tttcagacat gggtgttgga acaagcaaat tacaaaaagg aagcaatctc 300 

ggggttctat cgtttcatat ggaatctctt ctatgcagag tatgcgattt tccccttcgt 360 

gt 362 






<210> 108 
<211> 575

<212> DNA

<213> TOMATO

<400> 108

cagatcaatt ccagttcctg ctgagttttc tccactcaaa accagttcac atgcaatagt 60

acgggttttg aaatgtaaag catggaagag accaaaaaag cactattcct cttcaatgaa 120

gttgcagcgg cagtatatca cgcaagagca tgttggagga agtgatctaa gcactattgc 180

tgctgataaa aaacttaaag ggagattttt ggtgcacgca tcatctgaac accctcttga 240

atctcaacct tctaaaagtc cttgggactc agttaatgat gccgtagatg ctttctacag 300

gttctcgcgg ccccatacca taataggaac agcattgagc ataatttcag tttctctcct 360

tgcagttgag aagttctctg atttttctcc attatttttc actggggtgt tagaggccat 420

tgttgctgcc ctattcatga acatttacat agttggttta aaccagttgt ctgacatcga 480

aatagacaag gtaaacaagc catatcttcc attggcatca ggggaatact ctgtacaaac 540

tggagtgatt gttgtgtcgt cttttgccat tttga 575

<210> 109

<211> 1663

<212> DNA

<213> ARABIDOPSIS

<400> 109

aacaccaaac acacaatttc acattctttt gcatatttct tcttcttctt ccattatgga 60

gatacggagc ttgattgttt ctatgaaccc taatttatct tcctttgagc tctctcgccc 120

tgtatctcct ctcactcgct cactagttcc gttccgatcg actaaactag ttccccgctc 180

catttctagg gggatcccgt cgatctccac cccgaatagt gaaactgaca agatctccgt 240

taaacctgtt tacgtcccga cgtctcccaa tcgcgaactc cggactcctc acagtggata 300

ccatttcgat ggaacacctc ggaagttctt cgagggatgg tggatccggg tttccatccc 360

agagaagagg gagagttttt gttttatgta ttctgtggag aatcctgcat ttcggcagag 420

tttgtcacca ttggaagtgg ctctatatgg acctagattc actggtgttg gagctcagat 480

tcttggcgct aatgataaat atttatgcca atacgaacaa gactctcaca atttctgggg 540

agatcgacat gagctagttt tggggaatac ttttagtgct gtgccaggcg caaaggctcc 600

aaacaaggag gttccaccag aggaatttaa cagaagagtg tccgaagggt tccaagctac 660

tccattttgg catcaaggtc acatttgcga tgatggccgt actgactatg cggaaactgt 720

gaaatctgct cgttgggagt atagtactcg tcccgtttac ggttggggtg atgttggggc 780

caaacagaag tcaactgcag gctggcctgc agcttttcct gtatttgagc ctcattggca 840

gatatgcatg gcaggaggcc tttccacagg gtggatagaa tggggcggtg aaaggtttga 900

gtttcgggat gcaccttctt attcagagaa gaattggggt ggaggcttcc caagaaaatg 960

gttttgggtc cagtgtaatg tctttgaagg ggcaactgga gaagttgctt taaccgcagg 1020

tggcgggttg aggcaattgc ctggattgac tgagacctat gaaaatgctg cactggtttg 1080

tgtacactat gatggaaaaa tgtacgagtt tgttccttgg aatggtgttg ttagatggga 1140

aatgtctccc tggggttatt ggtatataac tgcagagaac gaaaaccatg tggtggaact 1200

agaggcaaga acaaatgaag cgggtacacc tctgcgtgct cctaccacag aagttgggct 1260

agctacggct tgcagagata gttgttacgg tgaattgaag ttgcagatat gggaacggct 1320

atatgatgga agtaaaggca aggtgatatt agagacaaag agctcaatgg cagcagtgga 1380

gataggagga ggaccgtggt ttgggacatg gaaaggagat acgagcaaca cgcccgagct 1440

actaaaacag gctcttcagg tcccattgga tcttgaaagc gccttaggtt tggtcccttt 1500

cttcaagcca ccgggtctgt aacattgatg agtgttttgt ttgttgatag agacccatgt 1560

gatgaatgaa gccttagtca tgtcattgct agcttcacta ttatgtatgt atgattttag 1620

ttcgttcggt ccttgtggta aatgatacgg gccagtgtaa agt 1663



<210> 110 

<211> 488 

<212> PRT 

<213> ARABIDOPSIS 



<400> 110 



Met Glu Ile Arg Ser Leu Ile Val Ser Met Asn Pro Asn Leu Ser Ser
1 5 10 15

Phe Glu Leu Ser Arg Pro Val Ser Pro Leu Thr Arg Ser Leu Val Pro
20 25 30

Phe Arg Ser Thr Lys Leu Val Pro Arg Ser Ile Ser Arg Val Ser Ala
35 40 45

Ser Ile Ser Thr Pro Asn Ser Glu Thr Asp Lys Ile Ser Val Lys Pro
50 55 60

Val Tyr Val Pro Thr Ser Pro Asn Arg Glu Leu Arg Thr Pro His Ser
65 70 75 80

Gly Tyr His Phe Asp Gly Thr Pro Arg Lys Phe Phe Glu Gly Trp Tyr
85 90 95

Phe Arg Val Ser Ile Pro Glu Lys Arg Glu Ser Phe Cys Phe Met Tyr
100 105 110

Ser Val Glu Asn Pro Ala Phe Arg Gln Ser Leu Ser Pro Leu Glu Val
115 120 125

Ala Leu Tyr Gly Pro Arg Phe Thr Gly Val Gly Ala Gln Ile Leu Gly
130 135 140

Ala Asn Asp Lys Tyr Leu Cys Gln Tyr Glu Gln Asp Ser His Asn Phe
145 150 155 160

Trp Gly Asp Arg His Glu Leu Val Leu Gly Asn Thr Phe Ser Ala Val
165 170 175

Pro Gly Ala Lys Ala Pro Asn Lys Glu Val Pro Pro Glu Glu Phe Asn
180 185 190

Arg Arg Val Ser Glu Gly Phe Gln Ala Thr Pro Phe Trp His Gln Gly
195 200 205

His Ile Cys Asp Asp Gly Arg Thr Asp Tyr Ala Glu Thr Val Lys Ser
210 215 220

Ala Arg Trp Glu Tyr Ser Thr Arg Pro Val Tyr Gly Trp Gly Asp Val
225 230 235 240

Gly Ala Lys Gln Lys Ser Thr Ala Gly Trp Pro Ala Ala Phe Pro Val
245 250 255

Phe Glu Pro His Trp Gln Ile Cys Met Ala Gly Gly Leu Ser Thr Gly
260 265 270

Trp Ile Glu Trp Gly Gly Glu Arg Phe Glu Phe Arg Asp Ala Pro Ser
275 280 285

Tyr Ser Glu Lys Asn Trp Gly Gly Gly Phe Pro Arg Lys Trp Phe Trp
290 295 300

Val Gln Cys Asn Val Phe Glu Gly Ala Thr Gly Glu Val Ala Leu Thr
305 310 315 320

Ala Gly Gly Gly Leu Arg Gln Leu Pro Gly Leu Thr Glu Thr Tyr Glu
325 330 335

Asn Ala Ala Leu Val Cys Val His Tyr Asp Gly Lys Met Tyr Glu Phe
340 345 350

Val Pro Trp Asn Gly Val Val Arg Trp Glu Met Ser Pro Trp Gly Tyr
355 360 365

Trp Tyr Ile Thr Ala Glu Asn Glu Asn His Val Val Glu Leu Glu Ala
370 375 380

Arg Thr Asn Glu Ala Gly Thr Pro Leu Arg Ala Pro Thr Thr Glu Val
385 390 395 400

Gly Leu Ala Thr Ala Cys Arg Asp Ser Cys Tyr Gly Glu Leu Lys Leu
405 410 415

Gln Ile Trp Glu Arg Leu Tyr Asp Gly Ser Lys Gly Lys Val Ile Leu
420 425 430

Glu Thr Lys Ser Ser Met Ala Ala Val Glu Ile Gly Gly Gly Pro Trp
435 440 445

Phe Gly Thr Trp Lys Gly Asp Thr Ser Asn Thr Pro Glu Leu Leu Lys
450 455 460

Gln Ala Leu Gln Val Pro Leu Asp Leu Glu Ser Ala Leu Gly Leu Val
465 470 475 480

Pro Phe Phe Lys Pro Pro Gly Leu
485